From: "Arsen Arsenović" <aarsenovic@baylibre.com>
To: Alistair Popple <apopple@nvidia.com>
Cc: amd-gfx@lists.freedesktop.org, linux-mm@kvack.org,
cs-tech-ext@baylibre.com
Subject: Re: [BUG] Frequent hangs or WARNINGs when using heterogeneous memory with an AMD MI210 GPU
Date: Fri, 01 May 2026 16:25:53 +0200
Message-ID: <86340bp5ou.fsf@baylibre.com>
In-Reply-To: <afREaF6hcka_cxnY@nvdebian.thelocal>
Alistair Popple <apopple@nvidia.com> writes:
> I don't know the AMD driver well enough to comment definitively but
> chances are this warning is spurious. I have been meaning to put
> together a fix for it. The problem is that migrate_vma_setup()
> etc. allow for migration of anonymous folios, which is subtly
> different from only allowing migration of anonymous VMAs.
>
> Specifically migrate_vma checks for folio_test_anon() which returns
> true for private file-backed VMAs while the warning is based on
> vma_is_anonymous() which is false for such mappings. So it is possible
> for the driver to migrate a private file-backed mapping to GPU memory
> which will trigger this warning during teardown if the page wasn't
> migrated back.
Ah, if it is spurious, that is quite unfortunate. We were hoping it was
the same issue as the one the rest of the email was describing (those
hangs, unkillable processes, and bad page states), since that would mean
we have a good reproducer for it.
FWIW, that sounds like a plausible explanation; the program is using
dynamic_cast, so typeinfo will need to be accessed. The typeinfo is
mmap-ped from the executable, so it's file-backed. I don't see any
reason for this page to be thrown out of the GPU later, so it stays
mapped until exit, and causes the warning.
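As a rough illustration of how a page in a file-backed mapping can stop
being backed by the file, here is a minimal sketch. It is not part of the
reproducer; the helper name and the temporary file path are made up for
the example. It maps a file with MAP_PRIVATE, writes through the mapping,
and checks that the file itself is unchanged. On Linux, that write is
served by a private copy-on-write page, which is presumably the kind of
page that is anonymous from the folio's point of view while the VMA is
still file-backed.

```cpp
#include <cstdlib>
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper for illustration only.
// Returns true if a write through a MAP_PRIVATE file mapping is visible
// through the mapping but never reaches the underlying file.
bool private_write_stays_private()
{
    char path[] = "/tmp/private-map-XXXXXX"; // assumed scratch location
    int fd = mkstemp(path);
    if (fd < 0)
        return false;
    unlink(path); // the file stays alive until fd is closed

    const long page = sysconf(_SC_PAGESIZE);
    const char original[] = "file";
    if (page <= 0 || ftruncate(fd, page) != 0 ||
        pwrite(fd, original, sizeof original, 0) != (ssize_t)sizeof original) {
        close(fd);
        return false;
    }

    // A private, writable mapping of the file: the VMA is file-backed.
    void *map = mmap(nullptr, page, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return false;
    }
    char *p = static_cast<char *>(map);

    // Writing triggers copy-on-write; this process now sees its own copy.
    std::memcpy(p, "anon", 5);

    // Read the file back through the descriptor, bypassing the mapping.
    char on_disk[sizeof original] = {};
    bool ok = pread(fd, on_disk, sizeof on_disk, 0) == (ssize_t)sizeof on_disk
           && std::strcmp(on_disk, "file") == 0
           && std::strcmp(p, "anon") == 0;

    munmap(map, page);
    close(fd);
    return ok;
}
```

Calling private_write_stays_private() from a small main() should return
true on a typical Linux system with a writable /tmp. It says nothing
about what the AMD driver does with such pages; it only shows the
mapping-level behaviour.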
The trigger for the hangs is significantly harder to reproduce, and far
less self-contained.
So, I suppose we're left with a bug for which the reproducer is "run more
than nproc parallel AMDGPU- and HMM-utilizing processes in a loop and
cross fingers". :/
Thank you very much for fixing the WARN_ON!
Have a lovely day.
--
Arsen Arsenović