From: Jason Gunthorpe <jgg@nvidia.com>
To: Felix Kuehling <felix.kuehling@amd.com>,
Alistair Popple <apopple@nvidia.com>
Cc: Philip Yang <yangp@amd.com>, Philip Yang <Philip.Yang@amd.com>,
amd-gfx@lists.freedesktop.org, Linux MM <linux-mm@kvack.org>,
Leon Romanovsky <leonro@nvidia.com>
Subject: Re: [PATCH] drm/amdkfd: Fix svm_bo and vram page refcount
Date: Mon, 6 Oct 2025 10:21:44 -0300 [thread overview]
Message-ID: <20251006132144.GM3360665@nvidia.com> (raw)
In-Reply-To: <17ee1a4d-69cd-4be9-bd6a-2924e8731db8@amd.com>
On Fri, Oct 03, 2025 at 06:16:14PM -0400, Felix Kuehling wrote:
> It sounds like zone_device_page_init is just unsafe to use in
> general.
It can only be used if you know the page is freed.
> It assumes that pages have a 0 refcount.
Yes
> But I don't see a good way for drivers to guarantee that, because
> they are not in control of when the page refcounts for their
> zone-device pages get decremented.
?? Drivers are supposed to hoook pgmap->ops->page_free() and keep
track.
There is no way to write a driver without calling
zone_device_page_init() as there is no other defined way to re-use a
page that has been returned through page_free().
It is completely wrong to call get_page() on a 0 refcount folio, we
don't have a debugging crash for this, but we really should. If you
think the refcount could be 0 you have to use a try_get().
So this patch looks wrong to me, I see a page_free() implementation
and this is the only call to zone_device_page_init(). If you remove it
the driver is absolutely broken.
I would expect migration should be writing to freed memory and
zone_device_page_init() is the correct and only way to make freed
memory usable again.
Therefore, I expect the refcount to be 0 when
svm_migrate_ram_to_vram() picks a dst.
If it is not true, and you are tring to migrate to already allocated
VRAM, then WTF?
And if you really want to do that then yes you need to use get_page
but you need a different path to handle already allocated vs
page_free() called. get_page() MUST NOT be used to unfree page_free'd
memory.
The explanation in the commit doesn't really have enough detail:
> 1. CPU page fault handler get vram page, migrate the vram page to
> system page
> 2. GPU page fault migrate to the vram page, set page refcount to 1
So why is the same vram page being used for both? For #1 the VRAM page
is installed in a swap entry so it is has an elevated refcount.
The implication is that #2 is targeting already allocated VRAM memory
that is NOT FREE.
Jason
next prev parent reply other threads:[~2025-10-07 7:16 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-26 21:03 [PATCH] drm/amdkfd: Fix svm_bo and vram page refcount Philip Yang
2025-09-26 21:38 ` Kasiviswanathan, Harish
2025-09-30 14:38 ` James Zhu
2025-09-30 15:48 ` Mario Limonciello
2025-10-03 21:05 ` Felix Kuehling
2025-10-03 21:18 ` Philip Yang
2025-10-03 21:46 ` Felix Kuehling
2025-10-03 22:02 ` Philip Yang
2025-10-03 22:16 ` Felix Kuehling
2025-10-06 12:55 ` Philip Yang
2025-10-06 13:21 ` Jason Gunthorpe [this message]
2025-10-06 17:51 ` Felix Kuehling
2025-10-06 18:35 ` Jason Gunthorpe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251006132144.GM3360665@nvidia.com \
--to=jgg@nvidia.com \
--cc=Philip.Yang@amd.com \
--cc=amd-gfx@lists.freedesktop.org \
--cc=apopple@nvidia.com \
--cc=felix.kuehling@amd.com \
--cc=leonro@nvidia.com \
--cc=linux-mm@kvack.org \
--cc=yangp@amd.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.