From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-cve-announce@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@kernel.org>
Subject: CVE-2026-43404: mm: Fix a hmm_range_fault() livelock / starvation problem
Date: Fri, 8 May 2026 16:22:13 +0200 [thread overview]
Message-ID: <2026050841-CVE-2026-43404-4ddb@gregkh> (raw)
From: Greg Kroah-Hartman <gregkh@kernel.org>
Description
===========
In the Linux kernel, the following vulnerability has been resolved:
mm: Fix a hmm_range_fault() livelock / starvation problem
If hmm_range_fault() fails a folio_trylock() in do_swap_page,
trying to acquire the lock of a device-private folio for migration,
to ram, the function will spin until it succeeds grabbing the lock.
However, if the process holding the lock is depending on a work
item to be completed, which is scheduled on the same CPU as the
spinning hmm_range_fault(), that work item might be starved and
we end up in a livelock / starvation situation which is never
resolved.
This can happen, for example if the process holding the
device-private folio lock is stuck in
migrate_device_unmap()->lru_add_drain_all()
sinc lru_add_drain_all() requires a short work-item
to be run on all online cpus to complete.
A prerequisite for this to happen is:
a) Both zone device and system memory folios are considered in
migrate_device_unmap(), so that there is a reason to call
lru_add_drain_all() for a system memory folio while a
folio lock is held on a zone device folio.
b) The zone device folio has an initial mapcount > 1 which causes
at least one migration PTE entry insertion to be deferred to
try_to_migrate(), which can happen after the call to
lru_add_drain_all().
c) No or voluntary only preemption.
This all seems pretty unlikely to happen, but indeed is hit by
the "xe_exec_system_allocator" igt test.
Resolve this by waiting for the folio to be unlocked if the
folio_trylock() fails in do_swap_page().
Rename migration_entry_wait_on_locked() to
softleaf_entry_wait_unlock() and update its documentation to
indicate the new use-case.
Future code improvements might consider moving
the lru_add_drain_all() call in migrate_device_unmap() to be
called *after* all pages have migration entries inserted.
That would eliminate also b) above.
v2:
- Instead of a cond_resched() in hmm_range_fault(),
eliminate the problem by waiting for the folio to be unlocked
in do_swap_page() (Alistair Popple, Andrew Morton)
v3:
- Add a stub migration_entry_wait_on_locked() for the
!CONFIG_MIGRATION case. (Kernel Test Robot)
v4:
- Rename migrate_entry_wait_on_locked() to
softleaf_entry_wait_on_locked() and update docs (Alistair Popple)
v5:
- Add a WARN_ON_ONCE() for the !CONFIG_MIGRATION
version of softleaf_entry_wait_on_locked().
- Modify wording around function names in the commit message
(Andrew Morton)
(cherry picked from commit a69d1ab971a624c6f112cea61536569d579c3215)
The Linux kernel CVE team has assigned CVE-2026-43404 to this issue.
Affected and fixed versions
===========================
Issue introduced in 6.15 with commit 1afaeb8293c9addbf4f9140bdd22635fed763459 and fixed in 6.18.19 with commit 94b6d0ba4b640ba23bb6c708a59316e74e5ede63
Issue introduced in 6.15 with commit 1afaeb8293c9addbf4f9140bdd22635fed763459 and fixed in 6.19.9 with commit 7e6e2fc91d4b9b12ec6e137019532568ebcf2680
Issue introduced in 6.15 with commit 1afaeb8293c9addbf4f9140bdd22635fed763459 and fixed in 7.0 with commit b570f37a2ce480be26c665345c5514686a8a0274
Please see https://www.kernel.org for a full list of currently supported
kernel versions by the kernel community.
Unaffected versions might change over time as fixes are backported to
older supported kernel versions. The official CVE entry at
https://cve.org/CVERecord/?id=CVE-2026-43404
will be updated if fixes are backported, please check that for the most
up to date information about this issue.
Affected files
==============
The file(s) affected by this issue are:
include/linux/migrate.h
mm/filemap.c
mm/memory.c
mm/migrate.c
mm/migrate_device.c
Mitigation
==========
The Linux kernel CVE team recommends that you update to the latest
stable kernel version for this, and many other bugfixes. Individual
changes are never tested alone, but rather are part of a larger kernel
release. Cherry-picking individual commits is not recommended or
supported by the Linux kernel community at all. If however, updating to
the latest release is impossible, the individual changes to resolve this
issue can be found at these commits:
https://git.kernel.org/stable/c/94b6d0ba4b640ba23bb6c708a59316e74e5ede63
https://git.kernel.org/stable/c/7e6e2fc91d4b9b12ec6e137019532568ebcf2680
https://git.kernel.org/stable/c/b570f37a2ce480be26c665345c5514686a8a0274
reply other threads:[~2026-05-08 14:24 UTC|newest]
Thread overview: [no followups] expand[flat|nested] mbox.gz Atom feed
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2026050841-CVE-2026-43404-4ddb@gregkh \
--to=gregkh@linuxfoundation.org \
--cc=cve@kernel.org \
--cc=gregkh@kernel.org \
--cc=linux-cve-announce@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox