From mboxrd@z Thu Jan 1 00:00:00 1970 From: Thomas Hellstrom Subject: Re: Memory eviction in ttm Date: Mon, 17 Sep 2012 23:35:28 +0200 Message-ID: <505797A0.4080206@vmware.com> References: <50508493.6060304@canonical.com> <50508E04.4090702@vmware.com> <5051F985.8020804@canonical.com> <50520CB9.8000607@vmware.com> <50521433.5070708@canonical.com> <50524966.5080805@vmware.com> <5052D8ED.7000206@canonical.com> <50530BE7.90700@vmware.com> <505332F7.9000101@canonical.com> Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1"; Format="flowed" Content-Transfer-Encoding: quoted-printable Return-path: Received: from smtp-outbound-1.vmware.com (smtp-outbound-1.vmware.com [208.91.2.12]) by gabe.freedesktop.org (Postfix) with ESMTP id 28AF19EC5D for ; Mon, 17 Sep 2012 14:35:33 -0700 (PDT) In-Reply-To: <505332F7.9000101@canonical.com> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: dri-devel-bounces+sf-dri-devel=m.gmane.org@lists.freedesktop.org Errors-To: dri-devel-bounces+sf-dri-devel=m.gmane.org@lists.freedesktop.org To: Maarten Lankhorst Cc: linaro-mm-sig@lists.linaro.org, dri-devel@lists.freedesktop.org List-Id: dri-devel@lists.freedesktop.org On 09/14/2012 03:36 PM, Maarten Lankhorst wrote: > Op 14-09-12 12:50, Thomas Hellstr=F6m schreef: >> Hi Maarten! >> >> Broadening the audience a bit.. >> >> On 9/14/12 9:12 AM, Maarten Lankhorst wrote: >>> Op 13-09-12 23:00, Thomas Hellstrom schreef: >>>> On 09/13/2012 07:13 PM, Maarten Lankhorst wrote: >>>>> Hey >>>>> >>>>> Op 13-09-12 18:41, Thomas Hellstrom schreef: >>>>>> On 09/13/2012 05:19 PM, Maarten Lankhorst wrote: >>>>>>> Hey, >>>>>>> >>>>>>> Op 12-09-12 15:28, Thomas Hellstrom schreef: >>>>>>>> On 09/12/2012 02:48 PM, Maarten Lankhorst wrote: >>>>>>>>> Hey Thomas, >>>>>>>>> >>>>>>>>> I'm playing around with moving reservations from ttm to global, b= ut how ttm >>>>>>>>> ttm is handling reservations is getting in the way. The code wan= ts to move >>>>>>>>> the bo from the lru lock at the same time a reservation is made, = but that >>>>>>>>> seems to be slightly too strict. It would really help me if that = guarantee >>>>>>>>> is removed. >>>>>>>> Hi, Maarten. >>>>>>>> >>>>>>>> Removing that restriction is not really possible at the moment. >>>>>>>> Also the memory accounting code depends on this, and may cause res= ervations >>>>>>>> in the most awkward places. Since these reservations don't have a = ticket >>>>>>>> they may and will cause deadlocks. So in short the restriction is = there >>>>>>>> to avoid deadlocks caused by ticketless reservations. >>>>>>> I have finished the lockdep annotations now which seems to catch al= most >>>>>>> all abuse I threw at it, so I'm feeling slightly more confident abo= ut moving >>>>>>> the locking order and reservations around. >>>>>> Maarten, moving reservations in TTM out of the lru lock is incorrect= as the code is >>>>>> written now. If we want to move it out we need something for ticketl= ess reservations >>>>>> >>>>>> I've been thinking of having a global hash table of tickets with the= task struct pointer as the key, >>>>>> but even then, we'd need to be able to handle EBUSY errors on every = operation that might try to >>>>>> reserve a buffer. >>>>>> >>>>>> The fact that lockdep doesn't complain isn't enough. There *will* be= deadlock use-cases when TTM is handed >>>>>> the right data-set. >>>>>> >>>>>> Isn't there a way that a subsystem can register a callback to be per= formed to remove stuff from LRU and >>>>>> to take a pre-reservation lock? >>>>> What if multiple subsystems need those? You will end up with a deadlo= ck again. >>>>> >>>>> I think it would be easier to change the code in ttm_bo.c to not assu= me the first >>>>> item on the lru list is really the least recently used, and assume th= e first item >>>>> that can be reserved without blocking IS the least recently used inst= ead. >>>> So what would happen then is that we'd spin on the first item on the L= RU list, since >>>> when reserving we must release the LRU lock, and if reserving fails, w= e thus >>>> need to restart LRU traversal. Typically after a schedule(). That's ba= d. >>>> >>>> So let's take a step back and analyze why the LRU lock has become a pr= oblem. >>>> From what I can tell, it's because you want to use per-object lock w= hen reserving instead of a >>>> global reservation lock (that TTM could use as the LRU lock). Is that = correct? >>>> and in that case, in what situation do you envision such a global lock= being contended >>>> to the extent that it hurts performance? >>>> >>>>>>> Lockdep WILL complain about trying to use multiple tickets, doing t= icketed >>>>>>> and unticketed blocking reservations mixed, etc. >>>>>>> >>>>>>> I want to remove the global fence_lock and make it a per buffer loc= k, with some >>>>>>> lockdep annotations it's perfectly legal to grab obj->fence_lock an= d obj2->fence_lock >>>>>>> if you have a reservation, but it should complain loudly about tryi= ng to take 2 fence_locks >>>>>>> at the same time without a reservation. >>>>>> Yes, TTM was previously using per buffer fence locks, and that works= fine from a deadlock perspective, but >>>>>> it hurts performance. Fencing 200 buffers in a command submission (g= oogle-earth for example) will mean >>>>>> 198 unnecessary locks, each discarding the processor pipelines. Lock= ing is a *slow* operation, particularly >>>>>> on systems with many processors, and I don't think it's a good idea = to change that back, without analyzing >>>>>> the performance impact. There are reasons people are writing stuff l= ike RCU to avoid locking... >>>>> So why don't we simply use RCU for fence pointers and get rid of the = fence locking? :D >>>>> danvet originally suggested it as a joke but if you think about it, i= t would make a lot of sense for this usecase. >>>> I thought of that before, but the problem is you'd still need a spinlo= ck to change the buffer's fence pointer, >>>> even if reading it becomes quick. >>> Actually, I changed lockdep annotations a bit to distinguish between the >>> cases where ttm_bo_wait is called without reservation, and ttm_bo_wait >>> is called with, as far as I can see there are only 2 places that do it = without, >>> at least if I converted my git tree properly.. >>> >>> http://cgit.freedesktop.org/~mlankhorst/linux/log/?h=3Dv10-wip >>> >>> First one is nouveau_bo_vma_del, this can be fixed easily. >>> Second one is ttm_bo_cleanup_refs and ttm_bo_cleanup_refs_or_queue, >>> if reservation is done first before ttm_bo_wait, the fence_lock could be >>> dropped entirely by adding smb_mb() in reserve and unreserve, functiona= lly >>> there would be no difference. So if you can verify my lockdep annotatio= ns are >>> correct in the most recent commit wrt what's using ttm_bo_wait without = reservation >>> we could remove the fence_lock entirely. >>> >>> ~Maarten >> Being able to wait for buffer idle or get the fence pointer without rese= rving is a fundamental property of TTM. Reservation is a long-term lock. Th= e fence lock is a very short term lock. If I were to choose, I'd rather acc= ept per-object fence locks than removing this property, but see below. > I was trying to say the ONLY place where this is used in ttm is in ttm_bo= _cleanup_refs(or *_or_queue). > No driver depends on this, they all take a reservation first. As such fen= ce_lock can be removed if you > change the order of the reservation and waiting around, making it behave = the same as the rest of ttm. > http://cgit.freedeskt > op.org/~mlankhorst/linux/commit/?h=3Dv10-wip&id=3Db879ac51810408809d6424d= 26da4013171d1bc15 > fixes up those 2 calls, then removes the fence_lock altogether. In the nouveau case the function nouveau_bo_vma_del seems to be called = from different places both with reservation and without reservation, and while it might look simple to add a reserve around a wait for idle, = it's a design decision not to require it, because in some situations = it's just not a good thing to do. >> Likewise, to be able to guarantee that a reserved object is not on any L= RU list is also an important property. Removing that property will, in addi= tion to the spin wait we've already discussed make understanding TTM lockin= g even more difficult, and I'd really like to avoid it. > This guarantee will no longer be true if reservations for buffer objects = are going to be shared > across devices, other users of the reservations may have their own requir= ements. > > Why is this property important though? If it could be changed to 'likely = unreserved' > instead of guaranteed not on the lru list the few places that rely on thi= s could be > fixed to attempt a nonblocking reserve first. Please see above what I write about spinning around the first item and list traversal restart. > > Is it just ttm_mem_evict_first / ttm_bo_swapout / ttm_bo_force_list_clean > that would break down? If so those could do a simple iteration over the l= ist > to find the first unreserved buffer instead, or do the default thing thos= e functions > do when the list is empty if they traversed the full list without findin= g anything > that could be reserved. TTM guarantees if desired, each user-space client full access to *all* GPU memory resources, so that a user-space client reliably knows when to flush. If it fails buffer validation in the first execbuf pass, it may take an = exclusive execbuf lock, evict *all* buffers and retry. Then *all* buffers must be evicted. = We can't leave any buffers around that are reserved but not taken off LRU lists. Also, at final buffer destruction you must be able to guarantee that you = have the buffer reserved and that you are the only one holding a reference. That's = pretty tricky to do if you don't take the buffer off LRU lists atomically when reserving.... > >> If this were a real performance problem we were trying to solve it would= be easier to motivate changes in this area, but if it's just trying to avo= id a global reservation lock and a global fence lock that will rarely if ev= er see any contention, I can't see the point. On the contrary, having per-o= bject locks will be very costly when reserving / fencing many objects. As m= entioned before, in the fence lock case it's been tried and removed, so I'd= like to know the reasoning behind introducing it again, and in what situat= ions you think the global locks will be contended. > Because I'm trying to keep things as simple as possible, and this whole r= eservation business is already giving me a headache > so I would really like to avoid any special lock that interacts with rese= rvations in a complex way and the easiest way would be > to instead accept that there reservation and your lru removal call may no= t be atomic. I hope to eventually support eviction though, > so some notification that the buffer was reserved succesfully is going to= be likely, I just don't think it needs to be done atomically. > Then may I humbly suggest that you put a global spinlock around = reserving, and a list of callbacks to be called while still locked after = reserving. Any subsystem wanting to pull stuff off LRU lists can then register a = callback. That's no complex interaction. /Thomas