From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ben Widawsky
Subject: Re: [PATCH] [RFC] drm/i915: read-read semaphore optimization
Date: Mon, 16 Jan 2012 14:20:55 -0800
Message-ID: <4F14A2C7.7020006@bwidawsk.net>
References: <1323748328-10153-1-git-send-email-ben@bwidawsk.net>
 <87fwgoidza.fsf@eliezer.anholt.net>
 <4EE79B1F.2000707@bwidawsk.net>
 <20120116215052.GO3627@phenom.ffwll.local>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Received: from cloud01.chad-versace.us
 (184-106-247-128.static.cloud-ips.com [184.106.247.128])
 by gabe.freedesktop.org (Postfix) with ESMTP id DD4009ED55;
 Mon, 16 Jan 2012 14:21:03 -0800 (PST)
In-Reply-To: <20120116215052.GO3627@phenom.ffwll.local>
Sender: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org
Errors-To: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org
To: Daniel Vetter
Cc: Daniel Vetter, intel-gfx@lists.freedesktop.org
List-Id: intel-gfx@lists.freedesktop.org

On 01/16/2012 01:50 PM, Daniel Vetter wrote:
> On Tue, Dec 13, 2011 at 10:36:15AM -0800, Ben Widawsky wrote:
>> On 12/13/2011 09:22 AM, Eric Anholt wrote:
>>> On Mon, 12 Dec 2011 19:52:08 -0800, Ben Widawsky wrote:
>>>> Since we don't differentiate on the different GPU read domains, it
>>>> should be safe to allow back to back reads to occur without issuing a
>>>> wait (or flush in the non-semaphore case).
>>>>
>>>> This has the unfortunate side effect that we need to keep track of all
>>>> the outstanding buffer reads so that we can synchronize on a write, to
>>>> another ring (since we don't know which read finishes first). In other
>>>> words, the code is quite simple for two rings, but gets more tricky for
>>>> > 2 rings.
>>>>
>>>> Here is a picture of the solution to the above problem
>>>>
>>>> Ring 0            Ring 1            Ring 2
>>>> batch 0           batch 1           batch 2
>>>>   read buffer A     read buffer A     wait batch 0
>>>>                                       wait batch 1
>>>>                                       write buffer A
>>>>
>>>> This code is really untested. I'm hoping for some feedback if this is
>>>> worth cleaning up, and testing more thoroughly.
>>>
>>> You say it's an optimization -- do you have performance numbers?
>>
>> 33% improvement on a hacked version of gem_ring_sync_loop with.
>>
>> It's not really a valid test as it's not coherent, but this is
>> approximately the best case improvement.
>>
>> Oddly, semaphores don't make much difference in this test, which
>> was surprising.
>
> Our domain tracking is already complicated in unfunny ways. And (at least
> without a use-case showing gains with hard numbers in either perf or power
> usage) I think this patch is the kind of "this looks cool" stuff that
> added a lot to the current problem.
>
> So before adding more complexity on top I'd like to remove some of the
> superfluous stuff we already have. I.e. all the flushing_list stuff and
> maybe other things ...

Can you be more clear on what exactly you want done before taking a
patch like this? Maybe I can work on it during some down time.

>
> Cheers, Daniel

~Ben
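
For anyone following the thread, the bookkeeping described in the quoted
patch description can be modeled roughly as below. This is only a sketch
in Python, not the patch itself: BufferTracker, access() and NUM_RINGS
are made-up names, the seqnos are arbitrary, and nothing here corresponds
to actual i915 structures. It just shows why reads need no waits on each
other while a write has to wait on the last read from every other ring.

```python
from dataclasses import dataclass, field

NUM_RINGS = 3  # assumption: three rings, as in the quoted picture


@dataclass
class BufferTracker:
    """Hypothetical per-buffer bookkeeping; these are not the i915 structures."""

    # Seqno of the last outstanding read on each ring (0 = none).
    last_read: list = field(default_factory=lambda: [0] * NUM_RINGS)
    last_write_ring: int = -1
    last_write_seqno: int = 0

    def access(self, ring, seqno, write):
        """Record an access and return {other_ring: seqno} it must wait on."""
        waits = {}
        # Reads and writes both order after an outstanding write on another ring.
        if self.last_write_seqno and self.last_write_ring != ring:
            waits[self.last_write_ring] = self.last_write_seqno
        if write:
            # A write must wait for every outstanding read on the other rings,
            # since we don't know which read finishes first.
            for r, s in enumerate(self.last_read):
                if r != ring and s:
                    waits[r] = max(waits.get(r, 0), s)
            # Later accesses only need to order against this write.
            self.last_read = [0] * NUM_RINGS
            self.last_write_ring, self.last_write_seqno = ring, seqno
        else:
            # Back-to-back reads from different rings add no wait on each other.
            self.last_read[ring] = seqno
        return waits


buf_a = BufferTracker()
print(buf_a.access(ring=0, seqno=1, write=False))  # batch 0 reads buffer A
print(buf_a.access(ring=1, seqno=1, write=False))  # batch 1 reads buffer A
print(buf_a.access(ring=2, seqno=1, write=True))   # batch 2 writes buffer A
```

In this model the two reads come back with no waits, and the write on
ring 2 comes back with one wait each for ring 0 and ring 1, matching the
picture. The per-ring list is the extra state that makes the > 2 rings
case more involved than the two-ring case.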