From: Catalin Marinas <catalin.marinas@arm.com>
To: Dev Jain <dev.jain@arm.com>
Cc: will@kernel.org, anshuman.khandual@arm.com,
quic_zhenhuah@quicinc.com, ryan.roberts@arm.com,
kevin.brodsky@arm.com, yangyicong@hisilicon.com,
joey.gouly@arm.com, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, david@redhat.com
Subject: Re: [PATCH v4] arm64: Enable vmalloc-huge with ptdump
Date: Fri, 4 Jul 2025 12:22:06 +0100 [thread overview]
Message-ID: <aGe5XjWKhgjzcw7L@arm.com> (raw)
In-Reply-To: <20250626052524.332-1-dev.jain@arm.com>
On Thu, Jun 26, 2025 at 10:55:24AM +0530, Dev Jain wrote:
> @@ -1301,16 +1314,39 @@ int pud_free_pmd_page(pud_t *pudp, unsigned long addr)
> }
>
> table = pmd_offset(pudp, addr);
> + /*
> + * Isolate the PMD table; in case of race with ptdump, this helps
> + * us to avoid taking the lock in __pmd_free_pte_page().
> + *
> + * Static key logic:
> + *
> + * Case 1: If ptdump does static_branch_enable(), and after that we
> + * execute the if block, then this patches in the read lock, ptdump has
> + * the write lock patched in, therefore ptdump will never read from
> + * a potentially freed PMD table.
> + *
> + * Case 2: If the if block starts executing before ptdump's
> + * static_branch_enable(), then no locking synchronization
> + * will be done. However, pud_clear() + the dsb() in
> + * __flush_tlb_kernel_pgtable will ensure that ptdump observes an
Statements like "observes" really need to be relative, not absolute.
Like in "observes an empty PUD before/after ...".
> + * empty PUD. Thus, it will never walk over a potentially freed
> + * PMD table.
> + */
> + pud_clear(pudp);
> + __flush_tlb_kernel_pgtable(addr);
> + if (static_branch_unlikely(&ptdump_lock_key)) {
> + mmap_read_lock(&init_mm);
> + mmap_read_unlock(&init_mm);
> + }
This needs a formal model ;).
static_branch_enable() is called before the mmap_write_lock(), so even
if the above observes the new branch, it may do the read_unlock() before
the ptdump_walk_pgd() attempted the write lock. So your case 1
description is not entirely correct.
I don't get case 2. You want to ensure pud_clear() is observed by the
ptdump code before the pmd_free(). The DSB in the TLB flushing code
ensures some ordering between the pud_clear() and presumably something
that the ptdump code can observe as well. Would that be the mmap
semaphore? However, the read_lock would only be attempted if this code
is seeing the static branch update, which is not guaranteed. I don't
think it even matters since the lock may be released anyway before the
write_lock in ptdump.
For example, you do a pud_clear() above, skip the whole static branch.
The ptdump comes along on another CPU but does not observe the
pud_clear() since there's no synchronisation. It goes ahead and
dereferences it while this CPU does a pmd_free().
And I can't get my head around memory ordering, it doesn't look sound.
static_branch_enable() I don't think has acquire semantics, at least not
in relation to the actual branch update. The static_branch_unlikely()
test, again, not sure what guarantees it has (I don't think it has any
in relation to memory loads). Maybe you have worked it all out and is
fine but it needs a better explanation and ideally some simple formal
model. Show it's correct with a global variable first and then we can
optimise with static branches.
--
Catalin
next prev parent reply other threads:[~2025-07-04 11:53 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-06-26 5:25 [PATCH v4] arm64: Enable vmalloc-huge with ptdump Dev Jain
2025-07-04 5:05 ` Dev Jain
2025-07-04 5:58 ` Anshuman Khandual
2025-07-04 6:37 ` Dev Jain
2025-07-04 11:22 ` Catalin Marinas [this message]
2025-07-04 11:42 ` Dev Jain
2025-07-07 0:44 ` Catalin Marinas
2025-07-14 9:26 ` Dev Jain
2025-07-14 15:27 ` Dev Jain
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aGe5XjWKhgjzcw7L@arm.com \
--to=catalin.marinas@arm.com \
--cc=anshuman.khandual@arm.com \
--cc=david@redhat.com \
--cc=dev.jain@arm.com \
--cc=joey.gouly@arm.com \
--cc=kevin.brodsky@arm.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=quic_zhenhuah@quicinc.com \
--cc=ryan.roberts@arm.com \
--cc=will@kernel.org \
--cc=yangyicong@hisilicon.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.