From mboxrd@z Thu Jan 1 00:00:00 1970
From: Simon Farnsworth
Subject: Re: [PATCH] drm/i915: Flush outstanding unpin tasks before pageflipping
Date: Mon, 05 Nov 2012 11:36:16 +0000
Message-ID: <2100211.dJA6K4njkZ@f17simon>
References: <1351761986-27982-1-git-send-email-chris@chris-wilson.co.uk> <3830942.fjPon6j4qR@deuteros> <20121101095851.0dd58c9e@jbarnes-desktop>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============0752560625=="
Return-path:
Received: from claranet-outbound-smtp01.uk.clara.net (claranet-outbound-smtp01.uk.clara.net [195.8.89.34])
 by gabe.freedesktop.org (Postfix) with ESMTP id 160C39E7F2
 for ; Mon, 5 Nov 2012 03:36:24 -0800 (PST)
In-Reply-To: <20121101095851.0dd58c9e@jbarnes-desktop>
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org
Errors-To: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org
To: intel-gfx@lists.freedesktop.org
Cc: Tvrtko Ursulin
List-Id: intel-gfx@lists.freedesktop.org

--===============0752560625==
Content-Type: multipart/signed; boundary="nextPart8251759.5gnlGLxlcf"; micalg="pgp-sha1"; protocol="application/pgp-signature"
Content-Transfer-Encoding: 7Bit

--nextPart8251759.5gnlGLxlcf
Content-Transfer-Encoding: 7Bit
Content-Type: text/plain; charset="us-ascii"

On Thursday 1 November 2012 09:58:51 Jesse Barnes wrote:
> On Thu, 01 Nov 2012 16:52:05 +0000
> Tvrtko Ursulin wrote:
>
> > On Thursday 01 November 2012 16:20:03 Chris Wilson wrote:
> > > On Thu, 1 Nov 2012 09:04:02 -0700, Jesse Barnes wrote:
> > > > On Thu, 01 Nov 2012 15:52:23 +0000
> > > >
> > > > Chris Wilson wrote:
> > > > > Actually I've justified the blocking here to myself, and prefer it to
> > > > > simply running the crtc->unpin_work. If userspace is swamping the system
> > > > > so badly that we can't run the kthreads quickly enough, it deserves a stall.
> > > > > Note that the unpin leak is still about the 3rd most common bug in
> > > > > Fedora, so this stall will be forced on many machines.
> > > >
> > > > Hm funky, why does Fedora hit it so much? Does some of the GNOME shell
> > > > stuff run unthrottled or something?
> > >
> > > I don't think so. I trust that in Tvrtko's use case, he is not so much
> > > hogging the GPU as keeping the system as a whole relatively busy. So I
> > > suspect it is more to do with CPU starvation of the kthreads than
> > > anything else.
> > >
> > > Tvrtko, do you have any feeling for why your machine was easily
> > > susceptible to this leak? Are the stalls noticeable and do they affect
> > > your performance targets?
> >
> > We didn't bother looking for any stalls, but for a long time we were
> > occasionally hitting this pin_count BUG in i915_gem_object_pin. So it didn't
> > in fact affect our performance targets as much as it completely wrecked our
> > system.
> >
> > If this patch causes an occasional stall instead, given that this bug triggers
> > every 3-4 hours of uptime, we are fine with that. If a frame or so is missed
> > every couple hours on low end hardware we don't care that much.
> >
> > More on the actual workload...
> >
> > Only recently we got lucky and found a platform and workload where it happens
> > reliably. And this patch reliably fixes that.
> >
> > In this workload CPU is being loaded 50-60% decoding a movie and rendering it
> > to a full screen window. Our proprietary compositor page flips at 60Hz only,
> > not faster. Together with another small semi-transparent window being rendered
> > on top of the full screen movie. Movie played is a 25fps one, which means the
> > full screen window is damaged 25 out of 60 frames (give or take) which is when
> > we render to our back buffer and page flip at the vsync rate (60Hz).
> >
> > According to the intel_gpu_top tool, GPU load is roughly at 40%, apart from
> > the "Framebuffer Compression" metric which is maxed out, if that one is at
> > all valid.
> >
> > This particular scenario triggers the bug only on two of our Atom based
> > platforms, both with an NM10/Pineview G/i915 chipset.
>
> Ah ok on Atom you're probably CPU constrained a bit, but still at
> 50-60% utilization the kthreads should be running at least sometimes...
>
> But it sounds like a case of the kthreads not running instead of
> queueing too fast anyway (not that the latter is really possible
> without some hacking to the flip code).
>
It may help you here to know that we run both our compositor and the X server
at real-time priorities - both are SCHED_RR static priority 1 (the lowest
realtime priority). IIRC, the kthreads run at SCHED_OTHER priority, so we are
quite capable of starving them during a burst of activity.
-- 
Simon Farnsworth
Software Engineer
ONELAN Ltd
http://www.onelan.com
--nextPart8251759.5gnlGLxlcf
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: This is a digitally signed message part.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)

iQEcBAABAgAGBQJQl6SzAAoJEIKsye9/dtRWZxUH/ArKRb6xJe7LeIgLOviGBCB8
22h8GnIHRIkjH+Hi61xyKfAfaapwpWQeVMnBTV5AgYC3uWnfP/0uqGBMs+Jtukop
m0X3GPISsilrPZhDZdY4nlMjf8oN1CKOZPzammvjD/4Oslqp26pnpvUxAxx7mUet
/Th8J2gbvwzSRETz0iPPRolsDKnZYzscE7NBGNgWMWklsfzaGzQ0YdzaeJn//riP
xs/WsR1YfPStoQTDmu3STA/BO2P2iOxqk/DeQtNg6l/E+3PV4PrayTRW0neneva5
65hD7xd4PwXTxMu9zRk6GB6BqcmDXES2Cqb2Y1GbDY+r1B3sTYMUlBAia0iuTTY=
=uniQ
-----END PGP SIGNATURE-----

--nextPart8251759.5gnlGLxlcf--

--===============0752560625==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

--===============0752560625==--