* [PATCH] drm/i915: Stop requesting error_state reports.
@ 2015-02-10 19:27 Rodrigo Vivi
2015-02-10 21:14 ` Chris Wilson
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Rodrigo Vivi @ 2015-02-10 19:27 UTC (permalink / raw)
To: intel-gfx; +Cc: Rodrigo Vivi
These error states are great for knowing the gpu state when it hangs.

But since we don't have automated tools to do analysis we are
facing much noise on bugzilla with end users reporting just
because "log asked to", while gpu reset worked and users probably
never notice any screen issue. Most of these reporters don't know
when it happened or how to retrigger the issue and sometimes
they are not even in the mood to retest.

So, let's minimize our and end users' noise and protect this smaller
message with drm.debug. Developers, OSVs and users that face
real screen issues (should) always enable this debug and will see
the message when the error state gets dumped.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
drivers/gpu/drm/i915/i915_gpu_error.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 48ddbf4..77d63be 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1297,11 +1297,7 @@ void i915_capture_error_state(struct drm_device *dev, bool wedged,
 	}
 
 	if (!warned) {
-		DRM_INFO("GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.\n");
-		DRM_INFO("Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel\n");
-		DRM_INFO("drm/i915 developers can then reassign to the right component if it's not a kernel issue.\n");
-		DRM_INFO("The gpu crash dump is required to analyze gpu hangs, so please always attach it.\n");
-		DRM_INFO("GPU crash dump saved to /sys/class/drm/card%d/error\n", dev->primary->index);
+		DRM_DEBUG_DRIVER("GPU crash dump saved to /sys/class/drm/card%d/error\n", dev->primary->index);
 		warned = true;
 	}
 }
--
1.9.3
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
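
The behavioural change in the patch above can be modelled outside the kernel. What follows is a minimal userspace sketch under stated assumptions, not driver code: `sketch_state`, `sketch_notify_dump` and `SKETCH_UT_DRIVER` are hypothetical names, and the 0x02 bit is assumed to stand in for the driver category of the drm.debug mask. It only illustrates that the one-shot `warned` flag is set whether or not the message is actually printed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed stand-in for the "driver" category bit of the drm.debug mask. */
#define SKETCH_UT_DRIVER 0x02u

/* Hypothetical state; the patched function uses a local "warned" flag. */
struct sketch_state {
	unsigned int debug_mask; /* models drm.debug */
	bool warned;             /* models the one-shot "warned" flag */
};

/*
 * Models the notice at the end of i915_capture_error_state().
 * debug_only == false: logged once unconditionally (DRM_INFO, before the patch).
 * debug_only == true:  logged once only if the driver debug bit is set
 *                      (DRM_DEBUG_DRIVER, after the patch).
 * Returns true when the notice was written to buf.
 */
bool sketch_notify_dump(struct sketch_state *s, bool debug_only,
			int card, char *buf, size_t len)
{
	assert(s != NULL && buf != NULL && len > 0);

	if (s->warned)
		return false;

	/* As in the patch, the flag is set whether or not anything is printed. */
	s->warned = true;

	if (debug_only && !(s->debug_mask & SKETCH_UT_DRIVER))
		return false;

	snprintf(buf, len, "GPU crash dump saved to /sys/class/drm/card%d/error",
		 card);
	return true;
}
```

In this model, a first hang with the debug bit clear consumes the one-shot notice silently, so enabling the debug bit afterwards does not bring the message back for later hangs.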

* Re: [PATCH] drm/i915: Stop requesting error_state reports.
2015-02-10 19:27 [PATCH] drm/i915: Stop requesting error_state reports Rodrigo Vivi
@ 2015-02-10 21:14 ` Chris Wilson
2015-02-10 22:02 ` Daniel Vetter
2015-02-11 5:45 ` shuang.he
2 siblings, 0 replies; 7+ messages in thread
From: Chris Wilson @ 2015-02-10 21:14 UTC (permalink / raw)
To: Rodrigo Vivi; +Cc: intel-gfx

On Tue, Feb 10, 2015 at 11:27:30AM -0800, Rodrigo Vivi wrote:
> These error states are great for knowing the gpu state when it hangs.
>
> But since we don't have automated tools to do analysis we are
> facing much noise on bugzilla with end users reporting just
> because "log asked to", while gpu reset worked and users probably
> never notice any screen issue.

Other than the corruption and the machine stuttering for a number of
seconds, sometimes many times per hour. I think that justifies having some
form of user visible message about what just happened.

> Most of these reporters don't know
> when it happened or how to retrigger the issue and sometimes
> they are not even in the mood to retest.

It is the goal of the error state to capture exactly enough information
for post-mortem analysis. As you feel it is inadequate, please augment it.

> So, let's minimize our and end users' noise and protect this smaller
> message with drm.debug. Developers, OSVs and users that face
> real screen issues (should) always enable this debug and will see
> the message when the error state gets dumped.

I don't think the tradeoff is worth it. If you have automatic bug
reporting, then yes you can remove the plea to have the user do it, but
not before.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH] drm/i915: Stop requesting error_state reports.
2015-02-10 19:27 [PATCH] drm/i915: Stop requesting error_state reports Rodrigo Vivi
2015-02-10 21:14 ` Chris Wilson
@ 2015-02-10 22:02 ` Daniel Vetter
2015-02-10 22:32 ` Rodrigo Vivi
2015-02-11 5:45 ` shuang.he
2 siblings, 1 reply; 7+ messages in thread
From: Daniel Vetter @ 2015-02-10 22:02 UTC (permalink / raw)
To: Rodrigo Vivi; +Cc: intel-gfx

On Tue, Feb 10, 2015 at 11:27:30AM -0800, Rodrigo Vivi wrote:
> These error states are great for knowing the gpu state when it hangs.
>
> But since we don't have automated tools to do analysis we are
> facing much noise on bugzilla with end users reporting just
> because "log asked to", while gpu reset worked and users probably
> never notice any screen issue. Most of these reporters don't know
> when it happened or how to retrigger the issue and sometimes
> they are not even in the mood to retest.

Hm, maybe we should reword it to make sure we only get good testers?

Instead of "Please file ..." do a "If you can build&test kernels and see
other issues together with this gpu hang notice file ..."?

I agree with Chris that we can't just mute these, overall the information
is imo valuable. We just need to get better at filtering them, and have
better information in the error states. E.g. with dri3 the pid/comm is
always the one for X, Mika has a small patch to fix that.

Closing our eyes won't make the bugs go away.
-Daniel

> So, let's minimize our and end users' noise and protect this smaller
> message with drm.debug. Developers, OSVs and users that face
> real screen issues (should) always enable this debug and will see
> the message when the error state gets dumped.

--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

* Re: [PATCH] drm/i915: Stop requesting error_state reports.
2015-02-10 22:02 ` Daniel Vetter
@ 2015-02-10 22:32 ` Rodrigo Vivi
2015-02-10 22:51 ` Daniel Vetter
2015-02-11 9:14 ` Jani Nikula
0 siblings, 2 replies; 7+ messages in thread
From: Rodrigo Vivi @ 2015-02-10 22:32 UTC (permalink / raw)
To: Daniel Vetter; +Cc: intel-gfx

On Tue, 2015-02-10 at 23:02 +0100, Daniel Vetter wrote:
> On Tue, Feb 10, 2015 at 11:27:30AM -0800, Rodrigo Vivi wrote:
> > These error states are great for knowing the gpu state when it hangs.
> >
> > But since we don't have automated tools to do analysis we are
> > facing much noise on bugzilla with end users reporting just
> > because "log asked to", while gpu reset worked and users probably
> > never notice any screen issue. Most of these reporters don't know
> > when it happened or how to retrigger the issue and sometimes
> > they are not even in the mood to retest.
>
> Hm, maybe we should reword it to make sure we only get good testers?
>
> Instead of "Please file ..." do a "If you can build&test kernels and see
> other issues together with this gpu hang notice file ..."?

This is just one thing. Some OSVs complain we have too much noise for
end users. So 2 noises: bugzilla and logs.

> I agree with Chris that we can't just mute these, overall the information
> is imo valuable. We just need to get better at filtering them, and have
> better information in the error states. E.g. with dri3 the pid/comm is
> always the one for X, Mika has a small patch to fix that.

I do agree this error state is valuable. I really like it.

> Closing our eyes won't make the bugs go away.

Indeed. But this patch doesn't intend to close our eyes, just to open them
when they have to be opened, i.e. when drm.debug is set.

When users face an issue that bothers or matters to them, they would enable
debug and then we would receive the report. And QA/OSVs/Devs should always
leave drm.debug enabled at a certain level anyway.

> -Daniel

Thanks,
Rodrigo.

> > So, let's minimize our and end users' noise and protect this smaller
> > message with drm.debug. Developers, OSVs and users that face
> > real screen issues (should) always enable this debug and will see
> > the message when the error state gets dumped.

* Re: [PATCH] drm/i915: Stop requesting error_state reports.
2015-02-10 22:32 ` Rodrigo Vivi
@ 2015-02-10 22:51 ` Daniel Vetter
2015-02-11 9:14 ` Jani Nikula
1 sibling, 0 replies; 7+ messages in thread
From: Daniel Vetter @ 2015-02-10 22:51 UTC (permalink / raw)
To: Rodrigo Vivi; +Cc: intel-gfx

On Tue, Feb 10, 2015 at 02:32:48PM -0800, Rodrigo Vivi wrote:
> On Tue, 2015-02-10 at 23:02 +0100, Daniel Vetter wrote:
> > On Tue, Feb 10, 2015 at 11:27:30AM -0800, Rodrigo Vivi wrote:
> > > These error states are great for knowing the gpu state when it hangs.
> > >
> > > But since we don't have automated tools to do analysis we are
> > > facing much noise on bugzilla with end users reporting just
> > > because "log asked to", while gpu reset worked and users probably
> > > never notice any screen issue. Most of these reporters don't know
> > > when it happened or how to retrigger the issue and sometimes
> > > they are not even in the mood to retest.
> >
> > Hm, maybe we should reword it to make sure we only get good testers?
> >
> > Instead of "Please file ..." do a "If you can build&test kernels and see
> > other issues together with this gpu hang notice file ..."?
>
> This is just one thing. Some OSVs complain we have too much noise for
> end users. So 2 noises: bugzilla and logs.

OSV noise is a different issue. We already have
i915.verbose_state_checks for that, maybe we need to add another one for
gpu hangs. But imo that should actually come from OSV, not development
(like Rob Clark has done for the state checker).

> > I agree with Chris that we can't just mute these, overall the information
> > is imo valuable. We just need to get better at filtering them, and have
> > better information in the error states. E.g. with dri3 the pid/comm is
> > always the one for X, Mika has a small patch to fix that.
>
> I do agree this error state is valuable. I really like it.
>
> > Closing our eyes won't make the bugs go away.
>
> Indeed. But this patch doesn't intend to close our eyes, just to open them
> when they have to be opened, i.e. when drm.debug is set.

Imo drm.debug is too silent, we've had that ages ago and it resulted in
lots of unnecessary roundtrips in bug reports because reporters almost
never attach the error state if there's a gpu hang. So that imo doesn't
help either with reducing bug team workload. The message is this verbose
because a single line was not good enough ...
-Daniel

--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

* Re: [PATCH] drm/i915: Stop requesting error_state reports.
2015-02-10 22:32 ` Rodrigo Vivi
2015-02-10 22:51 ` Daniel Vetter
@ 2015-02-11 9:14 ` Jani Nikula
1 sibling, 0 replies; 7+ messages in thread
From: Jani Nikula @ 2015-02-11 9:14 UTC (permalink / raw)
To: Rodrigo Vivi, Daniel Vetter; +Cc: intel-gfx

On Wed, 11 Feb 2015, Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
> When users face an issue that bothers or matters to them, they would enable
> debug and then we would receive the report. And QA/OSVs/Devs should always
> leave drm.debug enabled at a certain level anyway.

Judging by the bug reports, most users won't enable drm.debug when they
face issues and report bugs, if they attach dmesg at all.

If a GPU hang does not warrant an error in the logs, what does?

BR,
Jani.

--
Jani Nikula, Intel Open Source Technology Center
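
Since much of the thread turns on whether reporters attach the crash dump at all, here is a minimal sketch of saving it to a regular file for a bug report. The sysfs path comes from the log message in the patch; the function names (`error_state_path`, `copy_stream`, `save_error_state`) are hypothetical, and the sketch assumes the file is readable by the calling user, which may require root.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Builds "/sys/class/drm/card<N>/error"; returns 0, or -1 if buf is too small. */
int error_state_path(int card, char *buf, size_t len)
{
	int n;

	assert(buf != NULL && len > 0);
	n = snprintf(buf, len, "/sys/class/drm/card%d/error", card);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Copies everything from in to out; returns bytes copied, or -1 on error. */
long copy_stream(FILE *in, FILE *out)
{
	char chunk[4096];
	long total = 0;
	size_t got;

	while ((got = fread(chunk, 1, sizeof chunk, in)) > 0) {
		if (fwrite(chunk, 1, got, out) != got)
			return -1;
		total += (long)got;
	}
	return ferror(in) ? -1 : total;
}

/* Saves the dump for one card to dst_path; returns bytes saved, or -1. */
long save_error_state(int card, const char *dst_path)
{
	char src[64];
	FILE *in, *out;
	long n;

	if (error_state_path(card, src, sizeof src) != 0)
		return -1;
	in = fopen(src, "r");
	if (in == NULL)
		return -1;
	out = fopen(dst_path, "w");
	if (out == NULL) {
		fclose(in);
		return -1;
	}
	n = copy_stream(in, out);
	fclose(in);
	if (fclose(out) != 0)
		n = -1;
	return n;
}
```

A caller would use something like `save_error_state(0, "gpu-error.txt")` and attach the resulting file; the card index 0 and the output name are placeholders.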

* Re: [PATCH] drm/i915: Stop requesting error_state reports.
2015-02-10 19:27 [PATCH] drm/i915: Stop requesting error_state reports Rodrigo Vivi
2015-02-10 21:14 ` Chris Wilson
2015-02-10 22:02 ` Daniel Vetter
@ 2015-02-11 5:45 ` shuang.he
2 siblings, 0 replies; 7+ messages in thread
From: shuang.he @ 2015-02-11 5:45 UTC (permalink / raw)
To: shuang.he, ethan.gao, intel-gfx, rodrigo.vivi

Tested-By: PRC QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
Task id: 5751
-------------------------------------Summary-------------------------------------
Platform  Delta   drm-intel-nightly  Series Applied
PNV       +4-4    275/283            275/283
ILK               310/315            310/315
SNB       +3      320/346            323/346
IVB       -1      380/384            379/384
BYT               296/296            296/296
HSW       +3-1    422/428            424/428
BDW               318/333            318/333
-------------------------------------Detailed-------------------------------------
Platform  Test                                                                    drm-intel-nightly                     Series Applied
*PNV      igt_gem_fence_thrash_bo-write-verify-none                               PASS(4, M7)                           FAIL(1, M7)
*PNV      igt_gem_fence_thrash_bo-write-verify-x                                  PASS(4, M7)                           FAIL(1, M7)
*PNV      igt_gem_fence_thrash_bo-write-verify-y                                  PASS(5, M7)                           FAIL(1, M7)
 PNV      igt_gem_userptr_blits_coherency-sync                                    CRASH(4, M7)PASS(2, M7)               PASS(1, M7)
 PNV      igt_gem_userptr_blits_coherency-unsync                                  CRASH(4, M7)PASS(3, M7)               PASS(1, M7)
 PNV      igt_gem_userptr_blits_create-destroy-sync                               NRUN(1, M7)PASS(7, M7)                PASS(1, M7)
*PNV      igt_gem_userptr_blits_forked-unsync-swapping-mempressure-interruptible  PASS(2, M7)                           NO_RESULT(1, M7)
 PNV      igt_gen3_render_tiledx_blits                                            FAIL(3, M7)TIMEOUT(1, M7)PASS(4, M7)  FAIL(1, M7)
 PNV      igt_gen3_render_tiledy_blits                                            FAIL(3, M7)PASS(3, M7)                PASS(1, M7)
*SNB      igt_kms_flip_bo-too-big                                                 BLACKLIST(1, M35)                     PASS(1, M35)
*SNB      igt_kms_flip_bo-too-big-interruptible                                   BLACKLIST(1, M35)                     PASS(1, M35)
*SNB      igt_kms_flip_event_leak                                                 NSPT(5, M35)                          PASS(1, M35)
*IVB      igt_gem_storedw_batches_loop_secure-dispatch                            PASS(2, M4)                           DMESG_WARN(1, M4)
*HSW      igt_gem_pwrite_pread_snooped-pwrite-blt-cpu_mmap-performance            PASS(3, M40)                          DMESG_WARN(1, M40)
*HSW      igt_kms_flip_bo-too-big                                                 BLACKLIST(1, M40)                     PASS(1, M40)
*HSW      igt_kms_flip_bo-too-big-interruptible                                   BLACKLIST(1, M40)                     PASS(1, M40)
 HSW      igt_kms_flip_plain-flip-fb-recreate-interruptible                       TIMEOUT(5, M40)PASS(4, M40)           PASS(1, M40)
Note: Pay more attention to lines starting with '*'.