From: Thomas Hellstrom <thellstrom@vmware.com>
To: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: dri-devel@lists.freedesktop.org
Subject: Re: buggy/weird behavior in ttm
Date: Thu, 11 Oct 2012 18:57:28 +0200 [thread overview]
Message-ID: <5076FA78.9090507@vmware.com> (raw)
In-Reply-To: <5076DCAD.6070606@canonical.com>
On 10/11/2012 04:50 PM, Maarten Lankhorst wrote:
> I was trying to clean ttm up a little so my changes would be less invasive, and simplify
> the code for debuggability. During testing I noticed the following weirdnesses:
> - ttm_mem_evict_first ignores no_wait_gpu if the buffer is on the ddestroy list.
> If you follow the code, it will effectively spin in ttm_mem_evict_first if a bo
> is on the list and no_wait_gpu is true.
Yes, you're right. This is a bug. At a first glance it looks like the
code should
return unconditionally after kref_put().
>
> I was working on a commit that removes fence_lock since I was killing off the
> fence lock, but that requires some kind of defined behavior for this. Unless
> we leave this in place as expected behavior..
I don't quite follow you? If you mean defined behavior for the fence
lock it is
that it protects the bo::sync_obj and bo::sync_obj_arg of *all* buffers.
It was
prevously one lock per buffer. The locking order is that it should be
taken before
the lru lock, but looking at the code it seems it could be quite
simplified the other
way around...
Anyway, if you plan to remove the fence lock and protect it with
reserve, you must
make sure that a waiting reserve is never done in a destruction path. I
think this
mostly concerns the nvidia driver.
>
> - no_wait_reserve is ignored if no_wait_gpu is false
> ttm_bo_reserve_locked can only return true if no_wait_reserve is true, but
> subsequently it will do a wait_unreserved if no_wait_gpu is false.
> I'm planning on removing this argument and act like it is always true, since
> nothing on the lru list should fail to reserve currently.
Yes, since all buffers that are reserved are removed from the LRU list,
there
should never be a waiting reserve on them, so no_wait_reserve can be removed
from ttm_mem_evict_first, ttm_bo_evict and possibly other functions in
the call chain.
>
> - effectively unlimited callchain between some functions that all go through
> ttm_mem_evict_first:
>
> /------------------------\
> ttm_mem_evict_first - ttm_bo_evict - -ttm_bo_mem_space - ttm_bo_mem_force_space - ttm_mem_evict_first
> \ ttm_bo_handle_move_mem /
> I'm not surprised that there was a deadlock before, it seems to me it would
> be pretty suicidal to ever do a blocking reserve on any of those lists,
> lockdep would be all over you for this.
Well, at first this may look worse than it actually is. The driver's
eviction memory order determines the recursion depth
and typically it's 0 or 1, since subsequent ttm_mem_evict_first should
never touch the same LRU lists as the first one.
What would typically happen is that a BO is evicted from VRAM to TT, and
if there is no space in TT, another BO is evicted
to system memory, and the chain is terminated. However a driver could
set up any eviction order but that would be
a BUG.
But in essence, as you say, even with a small recursion depth, a waiting
reserve could cause a deadlock.
But there should be no waiting reserves in the eviction path currently.
>
> Also it seems ttm_bo_move_ttm, ttm_bo_move_memcpy and ttm_bo_move_accel_cleanup
> don't use some of their arguments, so could those be dropped?
Yes, but they are designed so that they could be plugged into the
driver::move interface, so
if you change the argument list you should do it on driver::move as
well, and
make sure they all have the same argument list.
> ~Maarten
>
/Thomas
next prev parent reply other threads:[~2012-10-11 16:57 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-10-11 14:50 buggy/weird behavior in ttm Maarten Lankhorst
2012-10-11 16:57 ` Thomas Hellstrom [this message]
2012-10-11 18:42 ` Maarten Lankhorst
2012-10-11 19:26 ` Thomas Hellstrom
2012-10-11 19:35 ` Thomas Hellstrom
2012-10-11 20:55 ` Maarten Lankhorst
2012-10-12 5:57 ` Thomas Hellstrom
2012-10-12 7:49 ` Maarten Lankhorst
2012-10-15 12:14 ` Thomas Hellstrom
2012-10-12 10:09 ` Maarten Lankhorst
2012-10-15 12:27 ` Thomas Hellstrom
2012-10-15 15:37 ` Maarten Lankhorst
2012-10-15 18:34 ` Maarten Lankhorst
2012-10-15 19:44 ` Jerome Glisse
2012-10-15 18:40 ` Thomas Hellstrom
2012-10-15 19:32 ` Maarten Lankhorst
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5076FA78.9090507@vmware.com \
--to=thellstrom@vmware.com \
--cc=dri-devel@lists.freedesktop.org \
--cc=maarten.lankhorst@canonical.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.