* Missed interrupt false positives mitigation @ 2016-02-16 11:47 Chris Wilson 2016-02-16 11:47 ` [PATCH 1/5] drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+ Chris Wilson ` (5 more replies) 0 siblings, 6 replies; 12+ messages in thread From: Chris Wilson @ 2016-02-16 11:47 UTC (permalink / raw) To: intel-gfx Mostly for CI, but a couple need review, -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/5] drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+ 2016-02-16 11:47 Missed interrupt false positives mitigation Chris Wilson @ 2016-02-16 11:47 ` Chris Wilson 2016-02-16 11:47 ` [PATCH 2/5] drm/i915: Separate out the seqno-barrier from engine->get_seqno Chris Wilson ` (4 subsequent siblings) 5 siblings, 0 replies; 12+ messages in thread From: Chris Wilson @ 2016-02-16 11:47 UTC (permalink / raw) To: intel-gfx; +Cc: Daniel Vetter, Mika Kuoppala In order to ensure seqno/irq coherency, we currently read a ring register. The mmio transaction following the interrupt delays the inspection of the seqno long enough for the MI_STORE_DWORD_IMM to update the CPU cache. However, it is only the memory timing that is important for the purposes of the delay, we do not need nor desire the extra forcewake. v3: Update commentary Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [v2] --- drivers/gpu/drm/i915/intel_ringbuffer.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 45ce45a5e122..abac0560d075 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -1553,10 +1553,19 @@ gen6_ring_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency) { /* Workaround to force correct ordering between irq and seqno writes on * ivb (and maybe also on snb) by reading from a CS register (like - * ACTHD) before reading the status page. */ + * ACTHD) before reading the status page. + * + * Note that this effectively stalls the read by the time it takes to + * do a memory transaction, which more or less ensures that the write + * from the GPU has sufficient time to invalidate the CPU cacheline. + * Alternatively we could delay the interrupt from the CS ring to give + * the write time to land, but that would incur a delay after every + * batch i.e. much more frequent than a delay when waiting for the + * interrupt (with the same net latency). + */ if (!lazy_coherency) { struct drm_i915_private *dev_priv = ring->dev->dev_private; - POSTING_READ(RING_ACTHD(ring->mmio_base)); + POSTING_READ_FW(RING_ACTHD(ring->mmio_base)); } return intel_read_status_page(ring, I915_GEM_HWS_INDEX); -- 2.7.0 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 2/5] drm/i915: Separate out the seqno-barrier from engine->get_seqno 2016-02-16 11:47 Missed interrupt false positives mitigation Chris Wilson 2016-02-16 11:47 ` [PATCH 1/5] drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+ Chris Wilson @ 2016-02-16 11:47 ` Chris Wilson 2016-02-16 12:51 ` Mika Kuoppala 2016-02-16 11:47 ` [PATCH 3/5] drm/i915: Harden detection of missed interrupts Chris Wilson ` (3 subsequent siblings) 5 siblings, 1 reply; 12+ messages in thread From: Chris Wilson @ 2016-02-16 11:47 UTC (permalink / raw) To: intel-gfx; +Cc: Mika Kuoppala In order to simplify future patches, extract the lazy_coherency optimisation our of the engine->get_seqno() vfunc into its own callback. v2: Rename the barrier to engine->irq_seqno_barrier to try and better reflect that the barrier is only required after the user interrupt before reading the seqno (to ensure that the seqno update lands in time as we do not have strict seqno-irq ordering on all platforms). Reviewed-by: Dave Gordon <david.s.gordon@intel.com> [#v2] v3: Comments for hangcheck paranoia. Mika wanted to keep the extra barrier inside the hangcheck, just in case. I can argue that it doesn't provide a barrier against anything, but the side-effects of applying the barrier may prevent a false declaration of a hung GPU. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Dave Gordon <david.s.gordon@intel.com> --- drivers/gpu/drm/i915/i915_debugfs.c | 6 +++--- drivers/gpu/drm/i915/i915_drv.h | 12 ++++++++---- drivers/gpu/drm/i915/i915_gpu_error.c | 2 +- drivers/gpu/drm/i915/i915_irq.c | 14 ++++++++++++-- drivers/gpu/drm/i915/i915_trace.h | 2 +- drivers/gpu/drm/i915/intel_lrc.c | 18 ++++++----------- drivers/gpu/drm/i915/intel_ringbuffer.c | 34 +++++++++++++++++---------------- drivers/gpu/drm/i915/intel_ringbuffer.h | 4 ++-- 8 files changed, 51 insertions(+), 41 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index ec0c2a05eed6..c4df580ed0de 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -600,7 +600,7 @@ static int i915_gem_pageflip_info(struct seq_file *m, void *data) ring->name, i915_gem_request_get_seqno(work->flip_queued_req), dev_priv->next_seqno, - ring->get_seqno(ring, true), + ring->get_seqno(ring), i915_gem_request_completed(work->flip_queued_req, true)); } else seq_printf(m, "Flip not associated with any ring\n"); @@ -732,7 +732,7 @@ static void i915_ring_seqno_info(struct seq_file *m, { if (ring->get_seqno) { seq_printf(m, "Current sequence (%s): %x\n", - ring->name, ring->get_seqno(ring, false)); + ring->name, ring->get_seqno(ring)); } } @@ -1342,8 +1342,8 @@ static int i915_hangcheck_info(struct seq_file *m, void *unused) intel_runtime_pm_get(dev_priv); for_each_ring(ring, dev_priv, i) { - seqno[i] = ring->get_seqno(ring, false); acthd[i] = intel_ring_get_active_head(ring); + seqno[i] = ring->get_seqno(ring); } i915_get_extra_instdone(dev, instdone); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 20d9dbd9f9cf..525f39c0551b 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3001,15 +3001,19 @@ i915_seqno_passed(uint32_t seq1, uint32_t seq2) static inline bool i915_gem_request_started(struct drm_i915_gem_request *req, bool lazy_coherency) { - u32 seqno = req->ring->get_seqno(req->ring, lazy_coherency); - return i915_seqno_passed(seqno, req->previous_seqno); + if (!lazy_coherency && req->ring->irq_seqno_barrier) + req->ring->irq_seqno_barrier(req->ring); + return i915_seqno_passed(req->ring->get_seqno(req->ring), + req->previous_seqno); } static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req, bool lazy_coherency) { - u32 seqno = req->ring->get_seqno(req->ring, lazy_coherency); - return i915_seqno_passed(seqno, req->seqno); + if (!lazy_coherency && req->ring->irq_seqno_barrier) + req->ring->irq_seqno_barrier(req->ring); + return i915_seqno_passed(req->ring->get_seqno(req->ring), + req->seqno); } int __must_check i915_gem_get_seqno(struct drm_device *dev, u32 *seqno); diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 978c026963b8..33f1e0636a03 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -906,8 +906,8 @@ static void i915_record_ring_state(struct drm_device *dev, ering->waiting = waitqueue_active(&ring->irq_queue); ering->instpm = I915_READ(RING_INSTPM(ring->mmio_base)); - ering->seqno = ring->get_seqno(ring, false); ering->acthd = intel_ring_get_active_head(ring); + ering->seqno = ring->get_seqno(ring); ering->start = I915_READ_START(ring); ering->head = I915_READ_HEAD(ring); ering->tail = I915_READ_TAIL(ring); diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 25a89373df63..07bc2cdd6252 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -2937,7 +2937,7 @@ static int semaphore_passed(struct intel_engine_cs *ring) if (signaller->hangcheck.deadlock >= I915_NUM_RINGS) return -1; - if (i915_seqno_passed(signaller->get_seqno(signaller, false), seqno)) + if (i915_seqno_passed(signaller->get_seqno(signaller), seqno)) return 1; /* cursory check for an unkickable deadlock */ @@ -3101,8 +3101,18 @@ static void i915_hangcheck_elapsed(struct work_struct *work) semaphore_clear_deadlocks(dev_priv); - seqno = ring->get_seqno(ring, false); + /* We don't strictly need an irq-barrier here, as we are not + * serving an interrupt request, be paranoid in case the + * barrier has side-effects (such as preventing a broken + * cacheline snoop) and so be sure that we can see the seqno + * advance. If the seqno should stick, due to a stale + * cacheline, we would erroneously declare the GPU hung. + */ + if (ring->irq_seqno_barrier) + ring->irq_seqno_barrier(ring); + acthd = intel_ring_get_active_head(ring); + seqno = ring->get_seqno(ring); if (ring->hangcheck.seqno == seqno) { if (ring_idle(ring, seqno)) { diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h index 52b2d409945d..cfb5f78a6e84 100644 --- a/drivers/gpu/drm/i915/i915_trace.h +++ b/drivers/gpu/drm/i915/i915_trace.h @@ -573,7 +573,7 @@ TRACE_EVENT(i915_gem_request_notify, TP_fast_assign( __entry->dev = ring->dev->primary->index; __entry->ring = ring->id; - __entry->seqno = ring->get_seqno(ring, false); + __entry->seqno = ring->get_seqno(ring); ), TP_printk("dev=%u, ring=%u, seqno=%u", diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 3a03646e343d..0e833d470c5e 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1837,7 +1837,7 @@ static int gen8_emit_flush_render(struct drm_i915_gem_request *request, return 0; } -static u32 gen8_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency) +static u32 gen8_get_seqno(struct intel_engine_cs *ring) { return intel_read_status_page(ring, I915_GEM_HWS_INDEX); } @@ -1847,9 +1847,8 @@ static void gen8_set_seqno(struct intel_engine_cs *ring, u32 seqno) intel_write_status_page(ring, I915_GEM_HWS_INDEX, seqno); } -static u32 bxt_a_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency) +static void bxt_seqno_barrier(struct intel_engine_cs *ring) { - /* * On BXT A steppings there is a HW coherency issue whereby the * MI_STORE_DATA_IMM storing the completed request's seqno @@ -1860,11 +1859,7 @@ static u32 bxt_a_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency) * bxt_a_set_seqno(), where we also do a clflush after the write. So * this clflush in practice becomes an invalidate operation. */ - - if (!lazy_coherency) - intel_flush_status_page(ring, I915_GEM_HWS_INDEX); - - return intel_read_status_page(ring, I915_GEM_HWS_INDEX); + intel_flush_status_page(ring, I915_GEM_HWS_INDEX); } static void bxt_a_set_seqno(struct intel_engine_cs *ring, u32 seqno) @@ -2034,12 +2029,11 @@ logical_ring_default_vfuncs(struct drm_device *dev, ring->irq_get = gen8_logical_ring_get_irq; ring->irq_put = gen8_logical_ring_put_irq; ring->emit_bb_start = gen8_emit_bb_start; + ring->get_seqno = gen8_get_seqno; + ring->set_seqno = gen8_set_seqno; if (IS_BXT_REVID(dev, 0, BXT_REVID_A1)) { - ring->get_seqno = bxt_a_get_seqno; + ring->irq_seqno_barrier = bxt_seqno_barrier; ring->set_seqno = bxt_a_set_seqno; - } else { - ring->get_seqno = gen8_get_seqno; - ring->set_seqno = gen8_set_seqno; } } diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index abac0560d075..573203618dfe 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -1548,8 +1548,8 @@ pc_render_add_request(struct drm_i915_gem_request *req) return 0; } -static u32 -gen6_ring_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency) +static void +gen6_seqno_barrier(struct intel_engine_cs *ring) { /* Workaround to force correct ordering between irq and seqno writes on * ivb (and maybe also on snb) by reading from a CS register (like @@ -1563,16 +1563,12 @@ gen6_ring_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency) * batch i.e. much more frequent than a delay when waiting for the * interrupt (with the same net latency). */ - if (!lazy_coherency) { - struct drm_i915_private *dev_priv = ring->dev->dev_private; - POSTING_READ_FW(RING_ACTHD(ring->mmio_base)); - } - - return intel_read_status_page(ring, I915_GEM_HWS_INDEX); + struct drm_i915_private *dev_priv = to_i915(ring->dev); + POSTING_READ_FW(RING_ACTHD(ring->mmio_base)); } static u32 -ring_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency) +ring_get_seqno(struct intel_engine_cs *ring) { return intel_read_status_page(ring, I915_GEM_HWS_INDEX); } @@ -1584,7 +1580,7 @@ ring_set_seqno(struct intel_engine_cs *ring, u32 seqno) } static u32 -pc_render_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency) +pc_render_get_seqno(struct intel_engine_cs *ring) { return ring->scratch.cpu_page[0]; } @@ -2783,7 +2779,8 @@ int intel_init_render_ring_buffer(struct drm_device *dev) ring->irq_get = gen8_ring_get_irq; ring->irq_put = gen8_ring_put_irq; ring->irq_enable_mask = GT_RENDER_USER_INTERRUPT; - ring->get_seqno = gen6_ring_get_seqno; + ring->irq_seqno_barrier = gen6_seqno_barrier; + ring->get_seqno = ring_get_seqno; ring->set_seqno = ring_set_seqno; if (i915_semaphore_is_enabled(dev)) { WARN_ON(!dev_priv->semaphore_obj); @@ -2800,7 +2797,8 @@ int intel_init_render_ring_buffer(struct drm_device *dev) ring->irq_get = gen6_ring_get_irq; ring->irq_put = gen6_ring_put_irq; ring->irq_enable_mask = GT_RENDER_USER_INTERRUPT; - ring->get_seqno = gen6_ring_get_seqno; + ring->irq_seqno_barrier = gen6_seqno_barrier; + ring->get_seqno = ring_get_seqno; ring->set_seqno = ring_set_seqno; if (i915_semaphore_is_enabled(dev)) { ring->semaphore.sync_to = gen6_ring_sync; @@ -2915,7 +2913,8 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev) ring->write_tail = gen6_bsd_ring_write_tail; ring->flush = gen6_bsd_ring_flush; ring->add_request = gen6_add_request; - ring->get_seqno = gen6_ring_get_seqno; + ring->irq_seqno_barrier = gen6_seqno_barrier; + ring->get_seqno = ring_get_seqno; ring->set_seqno = ring_set_seqno; if (INTEL_INFO(dev)->gen >= 8) { ring->irq_enable_mask = @@ -2988,7 +2987,8 @@ int intel_init_bsd2_ring_buffer(struct drm_device *dev) ring->mmio_base = GEN8_BSD2_RING_BASE; ring->flush = gen6_bsd_ring_flush; ring->add_request = gen6_add_request; - ring->get_seqno = gen6_ring_get_seqno; + ring->irq_seqno_barrier = gen6_seqno_barrier; + ring->get_seqno = ring_get_seqno; ring->set_seqno = ring_set_seqno; ring->irq_enable_mask = GT_RENDER_USER_INTERRUPT << GEN8_VCS2_IRQ_SHIFT; @@ -3019,7 +3019,8 @@ int intel_init_blt_ring_buffer(struct drm_device *dev) ring->write_tail = ring_write_tail; ring->flush = gen6_ring_flush; ring->add_request = gen6_add_request; - ring->get_seqno = gen6_ring_get_seqno; + ring->irq_seqno_barrier = gen6_seqno_barrier; + ring->get_seqno = ring_get_seqno; ring->set_seqno = ring_set_seqno; if (INTEL_INFO(dev)->gen >= 8) { ring->irq_enable_mask = @@ -3077,7 +3078,8 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev) ring->write_tail = ring_write_tail; ring->flush = gen6_ring_flush; ring->add_request = gen6_add_request; - ring->get_seqno = gen6_ring_get_seqno; + ring->irq_seqno_barrier = gen6_seqno_barrier; + ring->get_seqno = ring_get_seqno; ring->set_seqno = ring_set_seqno; if (INTEL_INFO(dev)->gen >= 8) { diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 566b0ae10ce0..4cea04491392 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -196,8 +196,8 @@ struct intel_engine_cs { * seen value is good enough. Note that the seqno will always be * monotonic, even if not coherent. */ - u32 (*get_seqno)(struct intel_engine_cs *ring, - bool lazy_coherency); + void (*irq_seqno_barrier)(struct intel_engine_cs *ring); + u32 (*get_seqno)(struct intel_engine_cs *ring); void (*set_seqno)(struct intel_engine_cs *ring, u32 seqno); int (*dispatch_execbuffer)(struct drm_i915_gem_request *req, -- 2.7.0 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 2/5] drm/i915: Separate out the seqno-barrier from engine->get_seqno 2016-02-16 11:47 ` [PATCH 2/5] drm/i915: Separate out the seqno-barrier from engine->get_seqno Chris Wilson @ 2016-02-16 12:51 ` Mika Kuoppala 0 siblings, 0 replies; 12+ messages in thread From: Mika Kuoppala @ 2016-02-16 12:51 UTC (permalink / raw) To: Chris Wilson, intel-gfx Chris Wilson <chris@chris-wilson.co.uk> writes: > In order to simplify future patches, extract the > lazy_coherency optimisation our of the engine->get_seqno() vfunc into > its own callback. > > v2: Rename the barrier to engine->irq_seqno_barrier to try and better > reflect that the barrier is only required after the user interrupt before > reading the seqno (to ensure that the seqno update lands in time as we > do not have strict seqno-irq ordering on all platforms). > > Reviewed-by: Dave Gordon <david.s.gordon@intel.com> [#v2] > > v3: Comments for hangcheck paranoia. Mika wanted to keep the extra > barrier inside the hangcheck, just in case. I can argue that it doesn't > provide a barrier against anything, but the side-effects of applying the > barrier may prevent a false declaration of a hung GPU. > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Mika Kuoppala <mika.kuoppala@intel.com> > Cc: Dave Gordon <david.s.gordon@intel.com> > --- Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> > drivers/gpu/drm/i915/i915_debugfs.c | 6 +++--- > drivers/gpu/drm/i915/i915_drv.h | 12 ++++++++---- > drivers/gpu/drm/i915/i915_gpu_error.c | 2 +- > drivers/gpu/drm/i915/i915_irq.c | 14 ++++++++++++-- > drivers/gpu/drm/i915/i915_trace.h | 2 +- > drivers/gpu/drm/i915/intel_lrc.c | 18 ++++++----------- > drivers/gpu/drm/i915/intel_ringbuffer.c | 34 +++++++++++++++++---------------- > drivers/gpu/drm/i915/intel_ringbuffer.h | 4 ++-- > 8 files changed, 51 insertions(+), 41 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c > index ec0c2a05eed6..c4df580ed0de 100644 > --- a/drivers/gpu/drm/i915/i915_debugfs.c > +++ b/drivers/gpu/drm/i915/i915_debugfs.c > @@ -600,7 +600,7 @@ static int i915_gem_pageflip_info(struct seq_file *m, void *data) > ring->name, > i915_gem_request_get_seqno(work->flip_queued_req), > dev_priv->next_seqno, > - ring->get_seqno(ring, true), > + ring->get_seqno(ring), > i915_gem_request_completed(work->flip_queued_req, true)); > } else > seq_printf(m, "Flip not associated with any ring\n"); > @@ -732,7 +732,7 @@ static void i915_ring_seqno_info(struct seq_file *m, > { > if (ring->get_seqno) { > seq_printf(m, "Current sequence (%s): %x\n", > - ring->name, ring->get_seqno(ring, false)); > + ring->name, ring->get_seqno(ring)); > } > } > > @@ -1342,8 +1342,8 @@ static int i915_hangcheck_info(struct seq_file *m, void *unused) > intel_runtime_pm_get(dev_priv); > > for_each_ring(ring, dev_priv, i) { > - seqno[i] = ring->get_seqno(ring, false); > acthd[i] = intel_ring_get_active_head(ring); > + seqno[i] = ring->get_seqno(ring); > } > > i915_get_extra_instdone(dev, instdone); > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index 20d9dbd9f9cf..525f39c0551b 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -3001,15 +3001,19 @@ i915_seqno_passed(uint32_t seq1, uint32_t seq2) > static inline bool i915_gem_request_started(struct drm_i915_gem_request *req, > bool lazy_coherency) > { > - u32 seqno = req->ring->get_seqno(req->ring, lazy_coherency); > - return i915_seqno_passed(seqno, req->previous_seqno); > + if (!lazy_coherency && req->ring->irq_seqno_barrier) > + req->ring->irq_seqno_barrier(req->ring); > + return i915_seqno_passed(req->ring->get_seqno(req->ring), > + req->previous_seqno); > } > > static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req, > bool lazy_coherency) > { > - u32 seqno = req->ring->get_seqno(req->ring, lazy_coherency); > - return i915_seqno_passed(seqno, req->seqno); > + if (!lazy_coherency && req->ring->irq_seqno_barrier) > + req->ring->irq_seqno_barrier(req->ring); > + return i915_seqno_passed(req->ring->get_seqno(req->ring), > + req->seqno); > } > > int __must_check i915_gem_get_seqno(struct drm_device *dev, u32 *seqno); > diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c > index 978c026963b8..33f1e0636a03 100644 > --- a/drivers/gpu/drm/i915/i915_gpu_error.c > +++ b/drivers/gpu/drm/i915/i915_gpu_error.c > @@ -906,8 +906,8 @@ static void i915_record_ring_state(struct drm_device *dev, > > ering->waiting = waitqueue_active(&ring->irq_queue); > ering->instpm = I915_READ(RING_INSTPM(ring->mmio_base)); > - ering->seqno = ring->get_seqno(ring, false); > ering->acthd = intel_ring_get_active_head(ring); > + ering->seqno = ring->get_seqno(ring); > ering->start = I915_READ_START(ring); > ering->head = I915_READ_HEAD(ring); > ering->tail = I915_READ_TAIL(ring); > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c > index 25a89373df63..07bc2cdd6252 100644 > --- a/drivers/gpu/drm/i915/i915_irq.c > +++ b/drivers/gpu/drm/i915/i915_irq.c > @@ -2937,7 +2937,7 @@ static int semaphore_passed(struct intel_engine_cs *ring) > if (signaller->hangcheck.deadlock >= I915_NUM_RINGS) > return -1; > > - if (i915_seqno_passed(signaller->get_seqno(signaller, false), seqno)) > + if (i915_seqno_passed(signaller->get_seqno(signaller), seqno)) > return 1; > > /* cursory check for an unkickable deadlock */ > @@ -3101,8 +3101,18 @@ static void i915_hangcheck_elapsed(struct work_struct *work) > > semaphore_clear_deadlocks(dev_priv); > > - seqno = ring->get_seqno(ring, false); > + /* We don't strictly need an irq-barrier here, as we are not > + * serving an interrupt request, be paranoid in case the > + * barrier has side-effects (such as preventing a broken > + * cacheline snoop) and so be sure that we can see the seqno > + * advance. If the seqno should stick, due to a stale > + * cacheline, we would erroneously declare the GPU hung. > + */ > + if (ring->irq_seqno_barrier) > + ring->irq_seqno_barrier(ring); > + > acthd = intel_ring_get_active_head(ring); > + seqno = ring->get_seqno(ring); > > if (ring->hangcheck.seqno == seqno) { > if (ring_idle(ring, seqno)) { > diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h > index 52b2d409945d..cfb5f78a6e84 100644 > --- a/drivers/gpu/drm/i915/i915_trace.h > +++ b/drivers/gpu/drm/i915/i915_trace.h > @@ -573,7 +573,7 @@ TRACE_EVENT(i915_gem_request_notify, > TP_fast_assign( > __entry->dev = ring->dev->primary->index; > __entry->ring = ring->id; > - __entry->seqno = ring->get_seqno(ring, false); > + __entry->seqno = ring->get_seqno(ring); > ), > > TP_printk("dev=%u, ring=%u, seqno=%u", > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c > index 3a03646e343d..0e833d470c5e 100644 > --- a/drivers/gpu/drm/i915/intel_lrc.c > +++ b/drivers/gpu/drm/i915/intel_lrc.c > @@ -1837,7 +1837,7 @@ static int gen8_emit_flush_render(struct drm_i915_gem_request *request, > return 0; > } > > -static u32 gen8_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency) > +static u32 gen8_get_seqno(struct intel_engine_cs *ring) > { > return intel_read_status_page(ring, I915_GEM_HWS_INDEX); > } > @@ -1847,9 +1847,8 @@ static void gen8_set_seqno(struct intel_engine_cs *ring, u32 seqno) > intel_write_status_page(ring, I915_GEM_HWS_INDEX, seqno); > } > > -static u32 bxt_a_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency) > +static void bxt_seqno_barrier(struct intel_engine_cs *ring) > { > - > /* > * On BXT A steppings there is a HW coherency issue whereby the > * MI_STORE_DATA_IMM storing the completed request's seqno > @@ -1860,11 +1859,7 @@ static u32 bxt_a_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency) > * bxt_a_set_seqno(), where we also do a clflush after the write. So > * this clflush in practice becomes an invalidate operation. > */ > - > - if (!lazy_coherency) > - intel_flush_status_page(ring, I915_GEM_HWS_INDEX); > - > - return intel_read_status_page(ring, I915_GEM_HWS_INDEX); > + intel_flush_status_page(ring, I915_GEM_HWS_INDEX); > } > > static void bxt_a_set_seqno(struct intel_engine_cs *ring, u32 seqno) > @@ -2034,12 +2029,11 @@ logical_ring_default_vfuncs(struct drm_device *dev, > ring->irq_get = gen8_logical_ring_get_irq; > ring->irq_put = gen8_logical_ring_put_irq; > ring->emit_bb_start = gen8_emit_bb_start; > + ring->get_seqno = gen8_get_seqno; > + ring->set_seqno = gen8_set_seqno; > if (IS_BXT_REVID(dev, 0, BXT_REVID_A1)) { > - ring->get_seqno = bxt_a_get_seqno; > + ring->irq_seqno_barrier = bxt_seqno_barrier; > ring->set_seqno = bxt_a_set_seqno; > - } else { > - ring->get_seqno = gen8_get_seqno; > - ring->set_seqno = gen8_set_seqno; > } > } > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c > index abac0560d075..573203618dfe 100644 > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c > @@ -1548,8 +1548,8 @@ pc_render_add_request(struct drm_i915_gem_request *req) > return 0; > } > > -static u32 > -gen6_ring_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency) > +static void > +gen6_seqno_barrier(struct intel_engine_cs *ring) > { > /* Workaround to force correct ordering between irq and seqno writes on > * ivb (and maybe also on snb) by reading from a CS register (like > @@ -1563,16 +1563,12 @@ gen6_ring_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency) > * batch i.e. much more frequent than a delay when waiting for the > * interrupt (with the same net latency). > */ > - if (!lazy_coherency) { > - struct drm_i915_private *dev_priv = ring->dev->dev_private; > - POSTING_READ_FW(RING_ACTHD(ring->mmio_base)); > - } > - > - return intel_read_status_page(ring, I915_GEM_HWS_INDEX); > + struct drm_i915_private *dev_priv = to_i915(ring->dev); > + POSTING_READ_FW(RING_ACTHD(ring->mmio_base)); > } > > static u32 > -ring_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency) > +ring_get_seqno(struct intel_engine_cs *ring) > { > return intel_read_status_page(ring, I915_GEM_HWS_INDEX); > } > @@ -1584,7 +1580,7 @@ ring_set_seqno(struct intel_engine_cs *ring, u32 seqno) > } > > static u32 > -pc_render_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency) > +pc_render_get_seqno(struct intel_engine_cs *ring) > { > return ring->scratch.cpu_page[0]; > } > @@ -2783,7 +2779,8 @@ int intel_init_render_ring_buffer(struct drm_device *dev) > ring->irq_get = gen8_ring_get_irq; > ring->irq_put = gen8_ring_put_irq; > ring->irq_enable_mask = GT_RENDER_USER_INTERRUPT; > - ring->get_seqno = gen6_ring_get_seqno; > + ring->irq_seqno_barrier = gen6_seqno_barrier; > + ring->get_seqno = ring_get_seqno; > ring->set_seqno = ring_set_seqno; > if (i915_semaphore_is_enabled(dev)) { > WARN_ON(!dev_priv->semaphore_obj); > @@ -2800,7 +2797,8 @@ int intel_init_render_ring_buffer(struct drm_device *dev) > ring->irq_get = gen6_ring_get_irq; > ring->irq_put = gen6_ring_put_irq; > ring->irq_enable_mask = GT_RENDER_USER_INTERRUPT; > - ring->get_seqno = gen6_ring_get_seqno; > + ring->irq_seqno_barrier = gen6_seqno_barrier; > + ring->get_seqno = ring_get_seqno; > ring->set_seqno = ring_set_seqno; > if (i915_semaphore_is_enabled(dev)) { > ring->semaphore.sync_to = gen6_ring_sync; > @@ -2915,7 +2913,8 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev) > ring->write_tail = gen6_bsd_ring_write_tail; > ring->flush = gen6_bsd_ring_flush; > ring->add_request = gen6_add_request; > - ring->get_seqno = gen6_ring_get_seqno; > + ring->irq_seqno_barrier = gen6_seqno_barrier; > + ring->get_seqno = ring_get_seqno; > ring->set_seqno = ring_set_seqno; > if (INTEL_INFO(dev)->gen >= 8) { > ring->irq_enable_mask = > @@ -2988,7 +2987,8 @@ int intel_init_bsd2_ring_buffer(struct drm_device *dev) > ring->mmio_base = GEN8_BSD2_RING_BASE; > ring->flush = gen6_bsd_ring_flush; > ring->add_request = gen6_add_request; > - ring->get_seqno = gen6_ring_get_seqno; > + ring->irq_seqno_barrier = gen6_seqno_barrier; > + ring->get_seqno = ring_get_seqno; > ring->set_seqno = ring_set_seqno; > ring->irq_enable_mask = > GT_RENDER_USER_INTERRUPT << GEN8_VCS2_IRQ_SHIFT; > @@ -3019,7 +3019,8 @@ int intel_init_blt_ring_buffer(struct drm_device *dev) > ring->write_tail = ring_write_tail; > ring->flush = gen6_ring_flush; > ring->add_request = gen6_add_request; > - ring->get_seqno = gen6_ring_get_seqno; > + ring->irq_seqno_barrier = gen6_seqno_barrier; > + ring->get_seqno = ring_get_seqno; > ring->set_seqno = ring_set_seqno; > if (INTEL_INFO(dev)->gen >= 8) { > ring->irq_enable_mask = > @@ -3077,7 +3078,8 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev) > ring->write_tail = ring_write_tail; > ring->flush = gen6_ring_flush; > ring->add_request = gen6_add_request; > - ring->get_seqno = gen6_ring_get_seqno; > + ring->irq_seqno_barrier = gen6_seqno_barrier; > + ring->get_seqno = ring_get_seqno; > ring->set_seqno = ring_set_seqno; > > if (INTEL_INFO(dev)->gen >= 8) { > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h > index 566b0ae10ce0..4cea04491392 100644 > --- a/drivers/gpu/drm/i915/intel_ringbuffer.h > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h > @@ -196,8 +196,8 @@ struct intel_engine_cs { > * seen value is good enough. Note that the seqno will always be > * monotonic, even if not coherent. > */ > - u32 (*get_seqno)(struct intel_engine_cs *ring, > - bool lazy_coherency); > + void (*irq_seqno_barrier)(struct intel_engine_cs *ring); > + u32 (*get_seqno)(struct intel_engine_cs *ring); > void (*set_seqno)(struct intel_engine_cs *ring, > u32 seqno); > int (*dispatch_execbuffer)(struct drm_i915_gem_request *req, > -- > 2.7.0 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 3/5] drm/i915: Harden detection of missed interrupts 2016-02-16 11:47 Missed interrupt false positives mitigation Chris Wilson 2016-02-16 11:47 ` [PATCH 1/5] drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+ Chris Wilson 2016-02-16 11:47 ` [PATCH 2/5] drm/i915: Separate out the seqno-barrier from engine->get_seqno Chris Wilson @ 2016-02-16 11:47 ` Chris Wilson 2016-02-16 12:51 ` Mika Kuoppala 2016-02-16 11:47 ` [PATCH 4/5] drm/i915: Use simplest form for flushing the single cacheline in the HWS Chris Wilson ` (2 subsequent siblings) 5 siblings, 1 reply; 12+ messages in thread From: Chris Wilson @ 2016-02-16 11:47 UTC (permalink / raw) To: intel-gfx; +Cc: Mika Kuoppala Only declare a missed interrupt if we find that the GPU is idle with waiters and a hangcheck interval has passed in which no new user interrupts have been raised. v2: Clear the stuck interrupt marker between successful batches Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> --- drivers/gpu/drm/i915/i915_debugfs.c | 11 +++++++---- drivers/gpu/drm/i915/i915_irq.c | 10 +++++++++- drivers/gpu/drm/i915/intel_ringbuffer.h | 2 ++ 3 files changed, 18 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index c4df580ed0de..f3ba97ad3e00 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -730,10 +730,10 @@ static int i915_gem_request_info(struct seq_file *m, void *data) static void i915_ring_seqno_info(struct seq_file *m, struct intel_engine_cs *ring) { - if (ring->get_seqno) { - seq_printf(m, "Current sequence (%s): %x\n", - ring->name, ring->get_seqno(ring)); - } + seq_printf(m, "Current sequence (%s): %x\n", + ring->name, ring->get_seqno(ring)); + seq_printf(m, "Current user interrupts (%s): %x\n", + ring->name, READ_ONCE(ring->user_interrupts)); } static int i915_gem_seqno_info(struct seq_file *m, void *data) @@ -1361,6 +1361,9 @@ static int i915_hangcheck_info(struct seq_file *m, void *unused) seq_printf(m, "%s:\n", ring->name); seq_printf(m, "\tseqno = %x [current %x]\n", ring->hangcheck.seqno, seqno[i]); + seq_printf(m, "\tuser interrupts = %x [current %x]\n", + ring->hangcheck.user_interrupts, + ring->user_interrupts); seq_printf(m, "\tACTHD = 0x%08llx [current 0x%08llx]\n", (long long)ring->hangcheck.acthd, (long long)acthd[i]); diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 07bc2cdd6252..c0aeff607130 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -1000,6 +1000,7 @@ static void notify_ring(struct intel_engine_cs *ring) return; trace_i915_gem_request_notify(ring); + ring->user_interrupts++; wake_up_all(&ring->irq_queue); } @@ -3097,6 +3098,7 @@ static void i915_hangcheck_elapsed(struct work_struct *work) for_each_ring(ring, dev_priv, i) { u64 acthd; u32 seqno; + unsigned user_interrupts; bool busy = true; semaphore_clear_deadlocks(dev_priv); @@ -3113,6 +3115,7 @@ static void i915_hangcheck_elapsed(struct work_struct *work) acthd = intel_ring_get_active_head(ring); seqno = ring->get_seqno(ring); + user_interrupts = READ_ONCE(ring->user_interrupts); if (ring->hangcheck.seqno == seqno) { if (ring_idle(ring, seqno)) { @@ -3120,7 +3123,8 @@ static void i915_hangcheck_elapsed(struct work_struct *work) if (waitqueue_active(&ring->irq_queue)) { /* Issue a wake-up to catch stuck h/w. */ - if (!test_and_set_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings)) { + if (ring->hangcheck.user_interrupts == user_interrupts && + !test_and_set_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings)) { if (!(dev_priv->gpu_error.test_irq_rings & intel_ring_flag(ring))) DRM_ERROR("Hangcheck timer elapsed... %s idle\n", ring->name); @@ -3183,10 +3187,14 @@ static void i915_hangcheck_elapsed(struct work_struct *work) memset(ring->hangcheck.instdone, 0, sizeof(ring->hangcheck.instdone)); + + /* Reset stuck interrupts between batch advances */ + user_interrupts = 0; } ring->hangcheck.seqno = seqno; ring->hangcheck.acthd = acthd; + ring->hangcheck.user_interrupts = user_interrupts; busy_count += busy; } diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 4cea04491392..dfb14bfe5bc8 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -90,6 +90,7 @@ struct intel_ring_hangcheck { u64 acthd; u64 max_acthd; u32 seqno; + unsigned user_interrupts; int score; enum intel_ring_hangcheck_action action; int deadlock; @@ -306,6 +307,7 @@ struct intel_engine_cs { * inspecting request list. */ u32 last_submitted_seqno; + unsigned user_interrupts; bool gpu_caches_dirty; -- 2.7.0 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 3/5] drm/i915: Harden detection of missed interrupts 2016-02-16 11:47 ` [PATCH 3/5] drm/i915: Harden detection of missed interrupts Chris Wilson @ 2016-02-16 12:51 ` Mika Kuoppala 0 siblings, 0 replies; 12+ messages in thread From: Mika Kuoppala @ 2016-02-16 12:51 UTC (permalink / raw) To: Chris Wilson, intel-gfx Chris Wilson <chris@chris-wilson.co.uk> writes: > Only declare a missed interrupt if we find that the GPU is idle with > waiters and a hangcheck interval has passed in which no new user > interrupts have been raised. > > v2: Clear the stuck interrupt marker between successful batches > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> > --- > drivers/gpu/drm/i915/i915_debugfs.c | 11 +++++++---- > drivers/gpu/drm/i915/i915_irq.c | 10 +++++++++- > drivers/gpu/drm/i915/intel_ringbuffer.h | 2 ++ > 3 files changed, 18 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c > index c4df580ed0de..f3ba97ad3e00 100644 > --- a/drivers/gpu/drm/i915/i915_debugfs.c > +++ b/drivers/gpu/drm/i915/i915_debugfs.c > @@ -730,10 +730,10 @@ static int i915_gem_request_info(struct seq_file *m, void *data) > static void i915_ring_seqno_info(struct seq_file *m, > struct intel_engine_cs *ring) > { > - if (ring->get_seqno) { > - seq_printf(m, "Current sequence (%s): %x\n", > - ring->name, ring->get_seqno(ring)); > - } > + seq_printf(m, "Current sequence (%s): %x\n", > + ring->name, ring->get_seqno(ring)); > + seq_printf(m, "Current user interrupts (%s): %x\n", > + ring->name, READ_ONCE(ring->user_interrupts)); > } > > static int i915_gem_seqno_info(struct seq_file *m, void *data) > @@ -1361,6 +1361,9 @@ static int i915_hangcheck_info(struct seq_file *m, void *unused) > seq_printf(m, "%s:\n", ring->name); > seq_printf(m, "\tseqno = %x [current %x]\n", > ring->hangcheck.seqno, seqno[i]); > + seq_printf(m, "\tuser interrupts = %x [current %x]\n", > + ring->hangcheck.user_interrupts, > + ring->user_interrupts); > seq_printf(m, "\tACTHD = 0x%08llx [current 0x%08llx]\n", > (long long)ring->hangcheck.acthd, > (long long)acthd[i]); > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c > index 07bc2cdd6252..c0aeff607130 100644 > --- a/drivers/gpu/drm/i915/i915_irq.c > +++ b/drivers/gpu/drm/i915/i915_irq.c > @@ -1000,6 +1000,7 @@ static void notify_ring(struct intel_engine_cs *ring) > return; > > trace_i915_gem_request_notify(ring); > + ring->user_interrupts++; > > wake_up_all(&ring->irq_queue); > } > @@ -3097,6 +3098,7 @@ static void i915_hangcheck_elapsed(struct work_struct *work) > for_each_ring(ring, dev_priv, i) { > u64 acthd; > u32 seqno; > + unsigned user_interrupts; > bool busy = true; > > semaphore_clear_deadlocks(dev_priv); > @@ -3113,6 +3115,7 @@ static void i915_hangcheck_elapsed(struct work_struct *work) > > acthd = intel_ring_get_active_head(ring); > seqno = ring->get_seqno(ring); > + user_interrupts = READ_ONCE(ring->user_interrupts); > > if (ring->hangcheck.seqno == seqno) { > if (ring_idle(ring, seqno)) { > @@ -3120,7 +3123,8 @@ static void i915_hangcheck_elapsed(struct work_struct *work) > > if (waitqueue_active(&ring->irq_queue)) { > /* Issue a wake-up to catch stuck h/w. */ > - if (!test_and_set_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings)) { > + if (ring->hangcheck.user_interrupts == user_interrupts && > + !test_and_set_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings)) { > if (!(dev_priv->gpu_error.test_irq_rings & intel_ring_flag(ring))) > DRM_ERROR("Hangcheck timer elapsed... %s idle\n", > ring->name); > @@ -3183,10 +3187,14 @@ static void i915_hangcheck_elapsed(struct work_struct *work) > > memset(ring->hangcheck.instdone, 0, > sizeof(ring->hangcheck.instdone)); > + > + /* Reset stuck interrupts between batch advances */ > + user_interrupts = 0; > } > > ring->hangcheck.seqno = seqno; > ring->hangcheck.acthd = acthd; > + ring->hangcheck.user_interrupts = user_interrupts; > busy_count += busy; > } > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h > index 4cea04491392..dfb14bfe5bc8 100644 > --- a/drivers/gpu/drm/i915/intel_ringbuffer.h > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h > @@ -90,6 +90,7 @@ struct intel_ring_hangcheck { > u64 acthd; > u64 max_acthd; > u32 seqno; > + unsigned user_interrupts; > int score; > enum intel_ring_hangcheck_action action; > int deadlock; > @@ -306,6 +307,7 @@ struct intel_engine_cs { > * inspecting request list. > */ > u32 last_submitted_seqno; > + unsigned user_interrupts; > > bool gpu_caches_dirty; > > -- > 2.7.0 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 4/5] drm/i915: Use simplest form for flushing the single cacheline in the HWS 2016-02-16 11:47 Missed interrupt false positives mitigation Chris Wilson ` (2 preceding siblings ...) 2016-02-16 11:47 ` [PATCH 3/5] drm/i915: Harden detection of missed interrupts Chris Wilson @ 2016-02-16 11:47 ` Chris Wilson 2016-02-16 12:58 ` Mika Kuoppala 2016-02-19 13:42 ` Mika Kuoppala 2016-02-16 11:47 ` [PATCH 5/5] drm/i915: Replace manual barrier() with READ_ONCE() in HWS accessor Chris Wilson 2016-02-16 12:14 ` ✗ Fi.CI.BAT: warning for series starting with [1/5] drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+ Patchwork 5 siblings, 2 replies; 12+ messages in thread From: Chris Wilson @ 2016-02-16 11:47 UTC (permalink / raw) To: intel-gfx; +Cc: Mika Kuoppala Rather than call a function to compute the matching cachelines and clflush them, just call the clflush *instruction* directly. We also know that we can use the unpatched plain clflush rather than the clflushopt alternative. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Imre Deak <imre.deak@intel.com> --- drivers/gpu/drm/i915/intel_ringbuffer.h | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index dfb14bfe5bc8..e2b2dc2c2f49 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -393,8 +393,9 @@ intel_ring_sync_index(struct intel_engine_cs *ring, static inline void intel_flush_status_page(struct intel_engine_cs *ring, int reg) { - drm_clflush_virt_range(&ring->status_page.page_addr[reg], - sizeof(uint32_t)); + mb(); + clflush(&ring->status_page.page_addr[reg]); + mb(); } static inline u32 -- 2.7.0 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 4/5] drm/i915: Use simplest form for flushing the single cacheline in the HWS 2016-02-16 11:47 ` [PATCH 4/5] drm/i915: Use simplest form for flushing the single cacheline in the HWS Chris Wilson @ 2016-02-16 12:58 ` Mika Kuoppala 2016-02-19 11:49 ` Chris Wilson 2016-02-19 13:42 ` Mika Kuoppala 1 sibling, 1 reply; 12+ messages in thread From: Mika Kuoppala @ 2016-02-16 12:58 UTC (permalink / raw) To: Chris Wilson, intel-gfx Chris Wilson <chris@chris-wilson.co.uk> writes: > Rather than call a function to compute the matching cachelines and > clflush them, just call the clflush *instruction* directly. We also know > that we can use the unpatched plain clflush rather than the clflushopt > alternative. > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Mika Kuoppala <mika.kuoppala@intel.com> > Cc: Imre Deak <imre.deak@intel.com> > --- > drivers/gpu/drm/i915/intel_ringbuffer.h | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h > index dfb14bfe5bc8..e2b2dc2c2f49 100644 > --- a/drivers/gpu/drm/i915/intel_ringbuffer.h > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h > @@ -393,8 +393,9 @@ intel_ring_sync_index(struct intel_engine_cs *ring, > static inline void > intel_flush_status_page(struct intel_engine_cs *ring, int reg) > { > - drm_clflush_virt_range(&ring->status_page.page_addr[reg], > - sizeof(uint32_t)); > + mb(); Checkpatch complains about memory barrier without comments. Don't we need to check cpu_has_clflush? -Mika > + clflush(&ring->status_page.page_addr[reg]); > + mb(); > } > > static inline u32 > -- > 2.7.0 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 4/5] drm/i915: Use simplest form for flushing the single cacheline in the HWS 2016-02-16 12:58 ` Mika Kuoppala @ 2016-02-19 11:49 ` Chris Wilson 0 siblings, 0 replies; 12+ messages in thread From: Chris Wilson @ 2016-02-19 11:49 UTC (permalink / raw) To: Mika Kuoppala; +Cc: intel-gfx On Tue, Feb 16, 2016 at 02:58:56PM +0200, Mika Kuoppala wrote: > Chris Wilson <chris@chris-wilson.co.uk> writes: > > > Rather than call a function to compute the matching cachelines and > > clflush them, just call the clflush *instruction* directly. We also know > > that we can use the unpatched plain clflush rather than the clflushopt > > alternative. > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > Cc: Mika Kuoppala <mika.kuoppala@intel.com> > > Cc: Imre Deak <imre.deak@intel.com> > > --- > > drivers/gpu/drm/i915/intel_ringbuffer.h | 5 +++-- > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h > > index dfb14bfe5bc8..e2b2dc2c2f49 100644 > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.h > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h > > @@ -393,8 +393,9 @@ intel_ring_sync_index(struct intel_engine_cs *ring, > > static inline void > > intel_flush_status_page(struct intel_engine_cs *ring, int reg) > > { > > - drm_clflush_virt_range(&ring->status_page.page_addr[reg], > > - sizeof(uint32_t)); > > + mb(); > > Checkpatch complains about memory barrier without comments. > > Don't we need to check cpu_has_clflush? No, these are only called when we know we have clflush. -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 4/5] drm/i915: Use simplest form for flushing the single cacheline in the HWS 2016-02-16 11:47 ` [PATCH 4/5] drm/i915: Use simplest form for flushing the single cacheline in the HWS Chris Wilson 2016-02-16 12:58 ` Mika Kuoppala @ 2016-02-19 13:42 ` Mika Kuoppala 1 sibling, 0 replies; 12+ messages in thread From: Mika Kuoppala @ 2016-02-19 13:42 UTC (permalink / raw) To: Chris Wilson, intel-gfx Chris Wilson <chris@chris-wilson.co.uk> writes: > Rather than call a function to compute the matching cachelines and > clflush them, just call the clflush *instruction* directly. We also know > that we can use the unpatched plain clflush rather than the clflushopt > alternative. > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Mika Kuoppala <mika.kuoppala@intel.com> > Cc: Imre Deak <imre.deak@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> > --- > drivers/gpu/drm/i915/intel_ringbuffer.h | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h > index dfb14bfe5bc8..e2b2dc2c2f49 100644 > --- a/drivers/gpu/drm/i915/intel_ringbuffer.h > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h > @@ -393,8 +393,9 @@ intel_ring_sync_index(struct intel_engine_cs *ring, > static inline void > intel_flush_status_page(struct intel_engine_cs *ring, int reg) > { > - drm_clflush_virt_range(&ring->status_page.page_addr[reg], > - sizeof(uint32_t)); > + mb(); > + clflush(&ring->status_page.page_addr[reg]); > + mb(); > } > > static inline u32 > -- > 2.7.0 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 5/5] drm/i915: Replace manual barrier() with READ_ONCE() in HWS accessor 2016-02-16 11:47 Missed interrupt false positives mitigation Chris Wilson ` (3 preceding siblings ...) 2016-02-16 11:47 ` [PATCH 4/5] drm/i915: Use simplest form for flushing the single cacheline in the HWS Chris Wilson @ 2016-02-16 11:47 ` Chris Wilson 2016-02-16 12:14 ` ✗ Fi.CI.BAT: warning for series starting with [1/5] drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+ Patchwork 5 siblings, 0 replies; 12+ messages in thread From: Chris Wilson @ 2016-02-16 11:47 UTC (permalink / raw) To: intel-gfx; +Cc: Daniel Vetter, Mika Kuoppala When reading from the HWS page, we use barrier() to prevent the compiler optimising away the read from the volatile (may be updated by the GPU) memory address. This is more suited to READ_ONCE(); make it so. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> --- drivers/gpu/drm/i915/intel_ringbuffer.h | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index e2b2dc2c2f49..231663a76387 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -403,8 +403,7 @@ intel_read_status_page(struct intel_engine_cs *ring, int reg) { /* Ensure that the compiler doesn't optimize away the load. */ - barrier(); - return ring->status_page.page_addr[reg]; + return READ_ONCE(ring->status_page.page_addr[reg]); } static inline void -- 2.7.0 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 12+ messages in thread
* ✗ Fi.CI.BAT: warning for series starting with [1/5] drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+ 2016-02-16 11:47 Missed interrupt false positives mitigation Chris Wilson ` (4 preceding siblings ...) 2016-02-16 11:47 ` [PATCH 5/5] drm/i915: Replace manual barrier() with READ_ONCE() in HWS accessor Chris Wilson @ 2016-02-16 12:14 ` Patchwork 5 siblings, 0 replies; 12+ messages in thread From: Patchwork @ 2016-02-16 12:14 UTC (permalink / raw) To: Chris Wilson; +Cc: intel-gfx == Summary == Series 3479v1 Series without cover letter http://patchwork.freedesktop.org/api/1.0/series/3479/revisions/1/mbox/ Test gem_ringfill: Subgroup basic-default-hang: incomplete -> PASS (snb-dellxps) Test gem_storedw_loop: Subgroup basic-blt: pass -> DMESG-WARN (hsw-brixbox) Subgroup basic-default: pass -> DMESG-WARN (hsw-brixbox) Test gem_sync: Subgroup basic-bsd: dmesg-fail -> PASS (ilk-hp8440p) Test kms_flip: Subgroup basic-flip-vs-dpms: dmesg-warn -> PASS (ilk-hp8440p) UNSTABLE Test pm_rpm: Subgroup basic-pci-d3-state: dmesg-warn -> PASS (byt-nuc) bdw-nuci7 total:162 pass:152 dwarn:0 dfail:0 fail:0 skip:10 bdw-ultra total:165 pass:152 dwarn:0 dfail:0 fail:0 skip:13 bsw-nuc-2 total:165 pass:135 dwarn:1 dfail:0 fail:0 skip:29 byt-nuc total:165 pass:141 dwarn:0 dfail:0 fail:0 skip:24 hsw-brixbox total:165 pass:149 dwarn:2 dfail:0 fail:0 skip:14 hsw-gt2 total:165 pass:154 dwarn:0 dfail:0 fail:1 skip:10 ilk-hp8440p total:165 pass:116 dwarn:0 dfail:0 fail:1 skip:48 ivb-t430s total:165 pass:150 dwarn:0 dfail:0 fail:1 skip:14 skl-i5k-2 total:165 pass:150 dwarn:0 dfail:0 fail:0 skip:15 snb-dellxps total:165 pass:142 dwarn:0 dfail:0 fail:1 skip:22 snb-x220t total:165 pass:142 dwarn:0 dfail:0 fail:2 skip:21 Results at /archive/results/CI_IGT_test/Patchwork_1413/ 63cbdd1816fd78d404ed004b0f931c497625e0df drm-intel-nightly: 2016y-02m-16d-09h-41m-02s UTC integration manifest 4970c2e5d6ea6531a56a6af34c1fb5a5e5573c32 drm/i915: Replace manual barrier() with READ_ONCE() in HWS accessor ef657bc04f06d1cb02ac8280311007349645cbf2 drm/i915: Use simplest form for flushing the single cacheline in the HWS 03004bc93407cf1f9b78d7197269bd694f8d4a23 drm/i915: Harden detection of missed interrupts 1979057827b8483dbd6a081eff6ac31cda07dc2d drm/i915: Separate out the seqno-barrier from engine->get_seqno b7062a5bbfc649ad265037f0c8cea869185e6471 drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+ _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2016-02-19 13:44 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2016-02-16 11:47 Missed interrupt false positives mitigation Chris Wilson 2016-02-16 11:47 ` [PATCH 1/5] drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+ Chris Wilson 2016-02-16 11:47 ` [PATCH 2/5] drm/i915: Separate out the seqno-barrier from engine->get_seqno Chris Wilson 2016-02-16 12:51 ` Mika Kuoppala 2016-02-16 11:47 ` [PATCH 3/5] drm/i915: Harden detection of missed interrupts Chris Wilson 2016-02-16 12:51 ` Mika Kuoppala 2016-02-16 11:47 ` [PATCH 4/5] drm/i915: Use simplest form for flushing the single cacheline in the HWS Chris Wilson 2016-02-16 12:58 ` Mika Kuoppala 2016-02-19 11:49 ` Chris Wilson 2016-02-19 13:42 ` Mika Kuoppala 2016-02-16 11:47 ` [PATCH 5/5] drm/i915: Replace manual barrier() with READ_ONCE() in HWS accessor Chris Wilson 2016-02-16 12:14 ` ✗ Fi.CI.BAT: warning for series starting with [1/5] drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+ Patchwork
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).