The Linux Kernel Mailing List
From: Kiryl Shutsemau <kas@kernel.org>
To: "Denis V. Lunev" <den@openvz.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>,
	 Andy Lutomirski <luto@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	x86@kernel.org,  Thomas Gleixner <tglx@kernel.org>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	 "H. Peter Anvin" <hpa@zytor.com>,
	"Mike Rapoport (Microsoft)" <rppt@kernel.org>,
	 Juergen Gross <jgross@suse.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] x86/mm/pat: take cpa_lock around large-page collapse
Date: Fri, 3 Jul 2026 14:01:35 +0100
Message-ID: <akeycc1jNMi-niIs@thinkstation>
In-Reply-To: <20260626163213.2284080-1-den@openvz.org>

On Fri, Jun 26, 2026 at 06:32:11PM +0200, Denis V. Lunev wrote:
> Loading and unloading modules concurrently on several CPUs on a KASAN
> build, with a short delay injected at the CPA page-table lookup to
> widen the window, faults within minutes:
> 
>   BUG: KASAN: use-after-free in __change_page_attr+0x7cc/0x7e0
>   Write of size 8 at addr ffff888181139718 by task modprobe
>   ...
>   The buggy address belongs to the physical page:
>    pfn:0x181139 ... page_type: f2(table)
> 
> cpa_collapse_large_pages() rebuilds a leaf PMD from its 4K PTEs and
> frees the old PTE-table pages, while __change_page_attr() fetches a
> PTE pointer from a lockless lookup_address_in_pgd_attr() and writes
> it with set_pte_atomic() only later. When module text is served from
> a shared large ROX mapping the two run on the same PMD:
> 
>   CPU A (module load)              CPU B (module finalize)
>   -------------------              -----------------------
>   execmem_make_temp_rw
>    set_memory_nx
>     __change_page_attr
>      split 2M -> 4K table P
>      kpte = &P[i]  (lockless)
>                                    execmem_restore_rox
>                                     set_memory_rox (CPA_COLLAPSE)
>                                      cpa_collapse_large_pages
>                                       rebuild leaf PMD
>                                       flush_tlb_all
>                                       pagetable_free(P)
>      set_pte_atomic(kpte, ...)
>        -> writes into freed P
> 
> P is a page-table page (page_type: table), reused at once, so the
> write corrupts whatever got the page next: a bad-pte or bad-page
> splat, or a fatal fault once P has been turned into read-only text.
> 
> The flush_tlb_all() before the free does not close this: its IPI only
> serializes against page-table walkers that run with interrupts off
> (e.g. GUP-fast); the walk in __change_page_attr() runs with interrupts
> on, so nothing stops it from holding a stale pointer into P.
> 
> Serialize the collapse - the PMD rebuild, TLB flush and PTE-table
> free - under cpa_lock, the lock __change_page_attr() takes for the
> split path, so a concurrent walker can no longer hold a pointer into
> a table the collapse is about to free.
> 
> debug_pagealloc bypasses cpa_lock in __change_page_attr() (the direct
> map is 4K then, with no large pages to serialize), so the lock cannot
> order the two there. Skip the collapse in that config: it is only an
> optimization, and not freeing the tables leaves the unserialized walk
> nothing to race.
> 
> Fixes: 41d88484c71c ("x86/mm/pat: restore large ROX pages after fragmentation")
> Signed-off-by: Denis V. Lunev <den@openvz.org>

Acked-by: Kiryl Shutsemau (Meta) <kas@kernel.org>

-- 
  Kiryl Shutsemau / Kirill A. Shutemov

Thread overview: 3+ messages
2026-06-26 16:32 [PATCH] x86/mm/pat: take cpa_lock around large-page collapse Denis V. Lunev
2026-07-02 17:47 ` Denis V. Lunev
2026-07-03 13:01 ` Kiryl Shutsemau [this message]
