Intel-XE Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: "Christian König" <christian.koenig@amd.com>
To: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
	matthew.brost@intel.com, dri-devel@lists.freedesktop.org,
	intel-xe@lists.freedesktop.org
Subject: Re: Switching over to GEM refcounts and a bunch of cleanups
Date: Thu, 21 Aug 2025 16:59:56 +0200	[thread overview]
Message-ID: <d3f60146-39d3-458f-8271-cfcec1c92254@amd.com> (raw)
In-Reply-To: <fd34b897a3223346518d3fe009772996eb25c90b.camel@linux.intel.com>

On 21.08.25 16:06, Thomas Hellström wrote:
>> What are you referring to?
> 
> https://lore.kernel.org/intel-xe/a004736315d77837172418eb196d5b5f80b74e6c.camel@linux.intel.com/

Thanks, that one never made it into my inbox as far as I can see.

> A couple of questions on the design direction here:
> 
> IIRC both xe and i915 has checks to consider objects with a 0 gem
> refcount as zombies requiring special treatment or skipping, when
> encountered in TTM callbacks. We need to double-check that.

I think I've found all of those. The one in i915 were actually not TTM specific but try to catch the same problem on the GEM refcount.

> But I wonder, 
> first this practice of resurrecting refcounts seem a bit unusual, I
> wonder if we can get rid of that somehow?

I was also going back on forth if that is a good idea or not as well.

The usual solution to such kinds of issues is to use two reference counts, so that you got a multi stage cleanup approach. E.g. backing store and object, like what mm_struct is using as well.

The problem was simply that TTM/GEM ended up having *four* reference counts for the same object, each was doing something different and they didn't worked well together at all.

> Furthermore, it seems the problem with drm_exec is related only to the
> LRU walk. What about adding a struct completion to the object, that is
> signaled when the object has freed its final backing-store. The LRU
> walk would then check if the object is a zombie, and if so just wait on
> the struct completion. (Need of course to carefully set up locking
> orders). Then we wouldn't need to resurrect the gem refcount, nor use
> drm_exec locking for zombies.

I had a similar idea, waiting is already possible by waiting for the BOs work item.

But I abandoned that idea because I couldn't see how we could solve the locking.

> We would still need some form of refcounting while waiting on the
> struct completion, but if we restricted the TTM refcount to *only* be
> used internally for that sole purpose, and also replaced the final
> ttm_bo_put() with the ttm_bo_finalize() that you suggest we wouldn't
> need to resurrect that refcount since it wouldn't drop to zero until
> the object is ready for final free.
> 
> Ideas, comments?

Ideally I think we would use the handle_count as backing store the drm_gem_object->refcount as structure reference.

But that means a massive rework of the GEM handling/drivers/TTM.

Alternative we could just grab a reference to a unsignaled fence when we encounter a dead BO on the LRU.

What do you think of that idea?

Regards,
Christian.

  reply	other threads:[~2025-08-21 15:00 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-16 16:04 Switching over to GEM refcounts and a bunch of cleanups Christian König
2025-07-16 16:04 ` [PATCH 1/7] drm/ttm: replace TTMs refcount with the DRM refcount v3 Christian König
2025-07-16 16:04 ` [PATCH 2/7] drm/ttm: remove ttm_lru_walk_ops Christian König
2025-07-16 16:04 ` [PATCH 3/7] drm/ttm: grab BO reference before locking it Christian König
2025-07-16 16:04 ` [PATCH 4/7] drm/ttm: switch to ttm_bo_lru_for_each_reserved_guarded for swapout Christian König
2025-07-16 16:04 ` [PATCH 5/7] drm/ttm: move zombie handling into ttm_bo_evict Christian König
2025-07-16 16:04 ` [PATCH 6/7] drm/ttm: use ttm_bo_lru_for_each_reserved_guarded in evict_all Christian König
2025-07-16 16:04 ` [PATCH 7/7] drm/xe: remove workaround for TTM internals Christian König
2025-07-22 12:36 ` Switching over to GEM refcounts and a bunch of cleanups Thomas Hellström
2025-08-04 12:24   ` Christian König
2025-08-21 12:30     ` Christian König
2025-08-21 14:06       ` Thomas Hellström
2025-08-21 14:59         ` Christian König [this message]
2025-08-22  8:51           ` Thomas Hellström
2025-09-08 15:16             ` Christian König
2025-09-09  7:21               ` Thomas Hellström
2025-09-09  8:13                 ` Christian König
2025-09-09  8:36                   ` Thomas Hellström

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d3f60146-39d3-458f-8271-cfcec1c92254@amd.com \
    --to=christian.koenig@amd.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=matthew.brost@intel.com \
    --cc=thomas.hellstrom@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox