From mboxrd@z Thu Jan 1 00:00:00 1970
From: Daniel Vetter
Subject: Re: [PATCH 04/13] drm/i915: Make semaphore updates more precise
Date: Wed, 30 Apr 2014 14:45:45 +0200
Message-ID: <20140430124545.GG20800@phenom.ffwll.local>
References: <1398808360-3674-1-git-send-email-benjamin.widawsky@intel.com>
 <1398808360-3674-5-git-send-email-benjamin.widawsky@intel.com>
In-Reply-To: <1398808360-3674-5-git-send-email-benjamin.widawsky@intel.com>
Sender: "Intel-gfx"
To: Ben Widawsky
Cc: Intel GFX
List-Id: intel-gfx@lists.freedesktop.org

On Tue, Apr 29, 2014 at 02:52:31PM -0700, Ben Widawsky wrote:
> With the ring mask we now have an easy way to know the number of rings
> in the system, and therefore can accurately predict the number of dwords
> to emit for semaphore signalling. This was not possible (easily)
> previously.
>
> There should be no functional impact, simply fewer instructions emitted.
>
> While we're here, simply do the round up to 2 instead of the fancier
> rounding we did before, which rounded up per mbox, i.e. 4. This also
> allows us to drop the unnecessary MI_NOOP, so not really 4, but 3.
>
> v2: Use 3 dwords instead of 4 (Ville)
>     Do the proper calculation to get the number of dwords to emit (Ville)
>     Conditionally set .sync_to when semaphores are enabled (Ville)
>
> v3: Rebased on VCS2
>     Replace hweight_long with hweight32 (Ville)
>
> Reviewed-by: Ville Syrjälä (v1)
> Signed-off-by: Ben Widawsky
> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 173 +++++++++++++++++---------------
>  1 file changed, 90 insertions(+), 83 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index e0c7bf2..7aedc0c 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -666,24 +666,19 @@ static void render_ring_cleanup(struct intel_ring_buffer *ring)
>  static int gen6_signal(struct intel_ring_buffer *signaller,
>  		       unsigned int num_dwords)
>  {
> +#define MBOX_UPDATE_DWORDS 3
>  	struct drm_device *dev = signaller->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct intel_ring_buffer *useless;
> -	int i, ret;
> +	int i, ret, num_rings;
>
> -	/* NB: In order to be able to do semaphore MBOX updates for varying
> -	 * number of rings, it's easiest if we round up each individual update
> -	 * to a multiple of 2 (since ring updates must always be a multiple of
> -	 * 2) even though the actual update only requires 3 dwords.
> -	 */
> -#define MBOX_UPDATE_DWORDS 4
> -	if (i915_semaphore_is_enabled(dev))
> -		num_dwords += ((I915_NUM_RINGS-1) * MBOX_UPDATE_DWORDS);
> +	num_rings = hweight32(INTEL_INFO(dev)->ring_mask);
> +	num_dwords += round_up((num_rings-1) * MBOX_UPDATE_DWORDS, 2);
> +#undef MBOX_UPDATE_DWORDS
>
>  	ret = intel_ring_begin(signaller, num_dwords);
>  	if (ret)
>  		return ret;
> -#undef MBOX_UPDATE_DWORDS
>
>  	for_each_ring(useless, dev_priv, i) {
>  		u32 mbox_reg = signaller->semaphore.mbox.signal[i];
> @@ -691,15 +686,13 @@ static int gen6_signal(struct intel_ring_buffer *signaller,
>  			intel_ring_emit(signaller, MI_LOAD_REGISTER_IMM(1));
>  			intel_ring_emit(signaller, mbox_reg);
>  			intel_ring_emit(signaller, signaller->outstanding_lazy_seqno);
> -			intel_ring_emit(signaller, MI_NOOP);
> -		} else {
> -			intel_ring_emit(signaller, MI_NOOP);
> -			intel_ring_emit(signaller, MI_NOOP);
> -			intel_ring_emit(signaller, MI_NOOP);
> -			intel_ring_emit(signaller, MI_NOOP);
>  		}
>  	}
>
> +	/* If num_dwords was rounded, make sure the tail pointer is correct */
> +	if (num_rings % 2 == 0)
> +		intel_ring_emit(signaller, MI_NOOP);
> +
>  	return 0;
>  }
>
> @@ -717,7 +710,11 @@ gen6_add_request(struct intel_ring_buffer *ring)
>  {
>  	int ret;
>
> -	ret = ring->semaphore.signal(ring, 4);
> +	if (ring->semaphore.signal)
> +		ret = ring->semaphore.signal(ring, 4);
> +	else
> +		ret = intel_ring_begin(ring, 4);
> +
>  	if (ret)
>  		return ret;
>

The hunks below look like a different patch. Accidental squash while
rebasing?

I've merged patches 1-3 of this series already.
-Daniel

> @@ -1928,24 +1925,27 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
>  		ring->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
>  		ring->get_seqno = gen6_ring_get_seqno;
>  		ring->set_seqno = ring_set_seqno;
> -		ring->semaphore.sync_to = gen6_ring_sync;
> -		ring->semaphore.signal = gen6_signal;
> -		/*
> -		 * The current semaphore is only applied on pre-gen8 platform.
> -		 * And there is no VCS2 ring on the pre-gen8 platform. So the
> -		 * semaphore between RCS and VCS2 is initialized as INVALID.
> -		 * Gen8 will initialize the sema between VCS2 and RCS later.
> -		 */
> -		ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_INVALID;
> -		ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_RV;
> -		ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_RB;
> -		ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_RVE;
> -		ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
> -		ring->semaphore.mbox.signal[RCS] = GEN6_NOSYNC;
> -		ring->semaphore.mbox.signal[VCS] = GEN6_VRSYNC;
> -		ring->semaphore.mbox.signal[BCS] = GEN6_BRSYNC;
> -		ring->semaphore.mbox.signal[VECS] = GEN6_VERSYNC;
> -		ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
> +		if (i915_semaphore_is_enabled(dev)) {
> +			ring->semaphore.sync_to = gen6_ring_sync;
> +			ring->semaphore.signal = gen6_signal;
> +			/*
> +			 * The current semaphore is only applied on pre-gen8
> +			 * platform. And there is no VCS2 ring on the pre-gen8
> +			 * platform. So the semaphore between RCS and VCS2 is
> +			 * initialized as INVALID. Gen8 will initialize the
> +			 * sema between VCS2 and RCS later.
> +			 */
> +			ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_INVALID;
> +			ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_RV;
> +			ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_RB;
> +			ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_RVE;
> +			ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
> +			ring->semaphore.mbox.signal[RCS] = GEN6_NOSYNC;
> +			ring->semaphore.mbox.signal[VCS] = GEN6_VRSYNC;
> +			ring->semaphore.mbox.signal[BCS] = GEN6_BRSYNC;
> +			ring->semaphore.mbox.signal[VECS] = GEN6_VERSYNC;
> +			ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
> +		}
>  	} else if (IS_GEN5(dev)) {
>  		ring->add_request = pc_render_add_request;
>  		ring->flush = gen4_render_ring_flush;
> @@ -2113,24 +2113,27 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev)
>  			ring->dispatch_execbuffer =
>  				gen6_ring_dispatch_execbuffer;
>  		}
> -		ring->semaphore.sync_to = gen6_ring_sync;
> -		ring->semaphore.signal = gen6_signal;
> -		/*
> -		 * The current semaphore is only applied on pre-gen8 platform.
> -		 * And there is no VCS2 ring on the pre-gen8 platform. So the
> -		 * semaphore between VCS and VCS2 is initialized as INVALID.
> -		 * Gen8 will initialize the sema between VCS2 and VCS later.
> -		 */
> -		ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_VR;
> -		ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
> -		ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_VB;
> -		ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_VVE;
> -		ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
> -		ring->semaphore.mbox.signal[RCS] = GEN6_RVSYNC;
> -		ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
> -		ring->semaphore.mbox.signal[BCS] = GEN6_BVSYNC;
> -		ring->semaphore.mbox.signal[VECS] = GEN6_VEVSYNC;
> -		ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
> +		if (i915_semaphore_is_enabled(dev)) {
> +			ring->semaphore.sync_to = gen6_ring_sync;
> +			ring->semaphore.signal = gen6_signal;
> +			/*
> +			 * The current semaphore is only applied on pre-gen8
> +			 * platform. And there is no VCS2 ring on the pre-gen8
> +			 * platform. So the semaphore between VCS and VCS2 is
> +			 * initialized as INVALID. Gen8 will initialize the
> +			 * sema between VCS2 and VCS later.
> +			 */
> +			ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_VR;
> +			ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
> +			ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_VB;
> +			ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_VVE;
> +			ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
> +			ring->semaphore.mbox.signal[RCS] = GEN6_RVSYNC;
> +			ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
> +			ring->semaphore.mbox.signal[BCS] = GEN6_BVSYNC;
> +			ring->semaphore.mbox.signal[VECS] = GEN6_VEVSYNC;
> +			ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
> +		}
>  	} else {
>  		ring->mmio_base = BSD_RING_BASE;
>  		ring->flush = bsd_ring_flush;
> @@ -2231,24 +2234,26 @@ int intel_init_blt_ring_buffer(struct drm_device *dev)
>  		ring->irq_put = gen6_ring_put_irq;
>  		ring->dispatch_execbuffer = gen6_ring_dispatch_execbuffer;
>  	}
> -	ring->semaphore.sync_to = gen6_ring_sync;
> -	ring->semaphore.signal = gen6_signal;
> -	/*
> -	 * The current semaphore is only applied on pre-gen8 platform. And
> -	 * there is no VCS2 ring on the pre-gen8 platform. So the semaphore
> -	 * between BCS and VCS2 is initialized as INVALID.
> -	 * Gen8 will initialize the sema between BCS and VCS2 later.
> -	 */
> -	ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_BR;
> -	ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_BV;
> -	ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
> -	ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_BVE;
> -	ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
> -	ring->semaphore.mbox.signal[RCS] = GEN6_RBSYNC;
> -	ring->semaphore.mbox.signal[VCS] = GEN6_VBSYNC;
> -	ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
> -	ring->semaphore.mbox.signal[VECS] = GEN6_VEBSYNC;
> -	ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
> +	if (i915_semaphore_is_enabled(dev)) {
> +		ring->semaphore.signal = gen6_signal;
> +		ring->semaphore.sync_to = gen6_ring_sync;
> +		/*
> +		 * The current semaphore is only applied on pre-gen8 platform.
> +		 * And there is no VCS2 ring on the pre-gen8 platform. So the
> +		 * semaphore between BCS and VCS2 is initialized as INVALID.
> +		 * Gen8 will initialize the sema between BCS and VCS2 later.
> +		 */
> +		ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_BR;
> +		ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_BV;
> +		ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
> +		ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_BVE;
> +		ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
> +		ring->semaphore.mbox.signal[RCS] = GEN6_RBSYNC;
> +		ring->semaphore.mbox.signal[VCS] = GEN6_VBSYNC;
> +		ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
> +		ring->semaphore.mbox.signal[VECS] = GEN6_VEBSYNC;
> +		ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
> +	}
>  	ring->init = init_ring_common;
>
>  	return intel_init_ring_buffer(dev, ring);
> @@ -2281,18 +2286,20 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev)
>  		ring->irq_put = hsw_vebox_put_irq;
>  		ring->dispatch_execbuffer = gen6_ring_dispatch_execbuffer;
>  	}
> -	ring->semaphore.sync_to = gen6_ring_sync;
> -	ring->semaphore.signal = gen6_signal;
> -	ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_VER;
> -	ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_VEV;
> -	ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_VEB;
> -	ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
> -	ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
> -	ring->semaphore.mbox.signal[RCS] = GEN6_RVESYNC;
> -	ring->semaphore.mbox.signal[VCS] = GEN6_VVESYNC;
> -	ring->semaphore.mbox.signal[BCS] = GEN6_BVESYNC;
> -	ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
> -	ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
> +	if (i915_semaphore_is_enabled(dev)) {
> +		ring->semaphore.sync_to = gen6_ring_sync;
> +		ring->semaphore.signal = gen6_signal;
> +		ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_VER;
> +		ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_VEV;
> +		ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_VEB;
> +		ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
> +		ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
> +		ring->semaphore.mbox.signal[RCS] = GEN6_RVESYNC;
> +		ring->semaphore.mbox.signal[VCS] = GEN6_VVESYNC;
> +		ring->semaphore.mbox.signal[BCS] = GEN6_BVESYNC;
> +		ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
> +		ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
> +	}
>  	ring->init = init_ring_common;
>
>  	return intel_init_ring_buffer(dev, ring);
> --
> 1.9.2
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
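
For reference, the dword accounting described in the quoted commit
message can be sketched in standalone C. This is only an illustration
under stated assumptions, not the driver code: popcount32() stands in
for the kernel's hweight32(), round_up_even() stands in for
round_up(x, 2), and signal_dwords() and needs_padding_noop() are
hypothetical helper names that do not exist in i915.

```c
#include <assert.h>
#include <stdint.h>

/* Dwords per mailbox update: LRI header, register, seqno (from the patch). */
#define MBOX_UPDATE_DWORDS 3

/* Portable stand-in for the kernel's hweight32(): count the set bits. */
static unsigned int popcount32(uint32_t v)
{
	unsigned int n = 0;

	while (v) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Stand-in for round_up(n, 2): round up to the next even number. */
static unsigned int round_up_even(unsigned int n)
{
	return (n + 1u) & ~1u;
}

/*
 * Hypothetical helper: total dwords to reserve when signalling, given
 * the dwords the caller needs and a mask with one bit per ring.
 * The signaller updates one mailbox for every *other* ring.
 */
unsigned int signal_dwords(unsigned int base_dwords, uint32_t ring_mask)
{
	unsigned int num_rings = popcount32(ring_mask);

	assert(num_rings >= 1);	/* assumption: at least the signaller exists */
	return base_dwords +
	       round_up_even((num_rings - 1) * MBOX_UPDATE_DWORDS);
}

/*
 * Hypothetical helper: (num_rings - 1) * 3 is odd exactly when
 * num_rings is even, so that is when one trailing MI_NOOP is needed
 * to match the rounded-up reservation.
 */
int needs_padding_noop(uint32_t ring_mask)
{
	return popcount32(ring_mask) % 2 == 0;
}
```

With a four-ring mask (0xf) and the 4 dwords requested by
gen6_add_request, this gives 4 + round_up(9, 2) = 14 dwords and one
trailing MI_NOOP; with three rings (0x7) it gives 4 + 6 = 10 dwords
and no padding.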