From mboxrd@z Thu Jan 1 00:00:00 1970 From: Ville =?iso-8859-1?Q?Syrj=E4l=E4?= Subject: Re: [PATCH] drm/i915: GFDT support for SNB/IVB Date: Mon, 4 Mar 2013 16:03:45 +0200 Message-ID: <20130304140345.GI4469@intel.com> References: <1362162777-21626-1-git-send-email-ville.syrjala@linux.intel.com> <20130303162852.GE9021@phenom.ffwll.local> Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Return-path: Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by gabe.freedesktop.org (Postfix) with ESMTP id C09A2E5D21 for ; Mon, 4 Mar 2013 06:03:51 -0800 (PST) Content-Disposition: inline In-Reply-To: <20130303162852.GE9021@phenom.ffwll.local> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org Errors-To: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org To: Daniel Vetter Cc: intel-gfx@lists.freedesktop.org List-Id: intel-gfx@lists.freedesktop.org On Sun, Mar 03, 2013 at 05:28:52PM +0100, Daniel Vetter wrote: > On Fri, Mar 01, 2013 at 08:32:57PM +0200, ville.syrjala@linux.intel.com w= rote: > > From: Ville Syrj=E4l=E4 > > = > > Currently all scanout buffers must be uncached because the > > display controller doesn't snoop the LLC. SNB introduced another > > method to guarantee coherency for the display controller. It's > > called the GFDT or graphics data type. > > = > > Pages that have the GFDT bit enabled in their PTEs get flushed > > all the way to memory when a MI_FLUSH_DW or PIPE_CONTROL is > > issued with the "synchronize GFDT" bit set. > > = > > So rather than making all scanout buffers uncached, set the GFDT > > bit in their PTEs, and modify the ring flush functions to enable > > the "synchronize GFDT" bit. > > = > > On HSW the GFDT bit was removed from the PTE, and it's only present in > > surface state, so we can't really set it from the kernel. Also the docs > > state that the hardware isn't actually guaranteed to respect the GFDT > > bit. So it looks like GFDT isn't all that useful on HSW. > > = > > So far I've tried this very quickly on an IVB machine, and > > it seems to be working as advertised. No idea if it does any > > good though. > > = > > TODO: > > - make sure there are no missing flushes (CPU access doesn't > > respect GFDT after all). > > - measure it and see if there's some real benefit > > - maybe we can track whether "synchronize GFDT" needs to be > > issued, and skip it when possible. needs some numbers to > > determine if it's worthwile. > > = > > Signed-off-by: Ville Syrj=E4l=E4 > = > Iirc when I've tried this out a while back it regressed a few benchmarks. > Chris&me suspected cache trahsing, but hard to tell without proper > instrumentation support. Chris played around with a few tricks to mark > other giant bos as uncacheable, but he couldn't find any improved > workloads. I see. I didn't realize this was tried already. Not that I really planned to implement this in the first place. I was just studying the code and figured I'd learn better by trying to change things a bit ;) But if there's interest I can of course try to improve it further. > In short I think this needs to come with decent amounts of numbers > attached, like the TODO says ;-) > = > The other thing was that I didn't manage to get things to work properly, > leaving some random cachline dirt on the screen. Looking at your code, you > add the gfdt flush to every ring_flush, whereas I've tried to be clever > and only flushed after batches rendering to the frontbuffer. So probably a > bug in my code, or a flush on a given ring doesn't flush out caches for > one of the other engines. I had a bug in my first attempt where I forgot the initial clflush. That left some random crap on the screen until I rendered the next frame w/ GFDT already enabled. Other than that I didn't see any corruption except when I intentionally left out the flushes. But as stated my code does too many flushes probably. -- = Ville Syrj=E4l=E4 Intel OTC