Re: [PATCH] mm/sparsemem: fix race in accessing memory_section->usage


From: Marco Elver <elver@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: quic_charante@quicinc.com, akpm@linux-foundation.org,
	aneesh.kumar@linux.ibm.com, dan.j.williams@intel.com,
	david@redhat.com, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, mgorman@techsingularity.net,
	osalvador@suse.de, vbabka@suse.cz,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Dmitry Vyukov <dvyukov@google.com>,
	kasan-dev@googlegroups.com, Ilya Leoshkevich <iii@linux.ibm.com>,
	Nicholas Miehlbradt <nicholas@linux.ibm.com>,
	rcu@vger.kernel.org
Subject: Re: [PATCH] mm/sparsemem: fix race in accessing memory_section->usage
Date: Wed, 17 Jan 2024 20:18:21 +0100
Message-ID: <Zagn_T44RU94dZa7@elver.google.com>
In-Reply-To: <CANpmjNMP802yN0i6puHHKX5E1PZ_6_h1x9nkGHCXZ4DVabxy7A@mail.gmail.com>

On Mon, Jan 15, 2024 at 09:34PM +0100, Marco Elver wrote:
> On Mon, 15 Jan 2024 at 19:44, Alexander Potapenko <glider@google.com> wrote:
> >
> > Cc: "Paul E. McKenney" <paulmck@kernel.org>
> > Cc: Marco Elver <elver@google.com>
> > Cc: Dmitry Vyukov <dvyukov@google.com>
> > Cc: kasan-dev@googlegroups.com
> > Cc: Ilya Leoshkevich <iii@linux.ibm.com>
> > Cc: Nicholas Miehlbradt <nicholas@linux.ibm.com>
> >
> > Hi folks,
> >
> > (adding KMSAN reviewers and IBM people who are currently porting KMSAN to other
> > architectures, plus Paul for his opinion on refactoring RCU)
> >
> > this patch broke x86 KMSAN in a subtle way.
> >
> > For every memory access in the code instrumented by KMSAN we call
> > kmsan_get_metadata() to obtain the metadata for the memory being accessed. For
> > virtual memory the metadata pointers are stored in the corresponding `struct
> > page`, therefore we need to call virt_to_page() to get them.
> >
> > According to the comment in arch/x86/include/asm/page.h, virt_to_page(kaddr)
> > returns a valid pointer iff virt_addr_valid(kaddr) is true, so KMSAN needs to
> > call virt_addr_valid() as well.
> >
> > To avoid recursion, kmsan_get_metadata() must not call instrumented code,
> > therefore ./arch/x86/include/asm/kmsan.h forks parts of arch/x86/mm/physaddr.c
> > to check whether a virtual address is valid or not.
> >
> > But the introduction of rcu_read_lock() to pfn_valid() added instrumented RCU
> > API calls to virt_to_page_or_null(), which is called by kmsan_get_metadata(),
> > so there is an infinite recursion now. I do not think it is correct to stop that
> > recursion by doing kmsan_enter_runtime()/kmsan_exit_runtime() in
> > kmsan_get_metadata(): that would prevent instrumented functions called from
> > within the runtime from tracking the shadow values, which might introduce false
> > positives.
> >
> > I am currently looking into inlining __rcu_read_lock()/__rcu_read_unlock() into
> > KMSAN code to prevent them from being instrumented, but that might require factoring
> > out parts of kernel/rcu/tree_plugin.h into a non-private header. Do you think this
> > is feasible?
> 
> __rcu_read_lock/unlock() is only outlined in PREEMPT_RCU. Not sure that helps.
> 
> Otherwise, there is rcu_read_lock_sched_notrace() which does the bare
> minimum and is static inline.
> 
> Does that help?

Hrm, rcu_read_unlock_sched_notrace() can still call
__preempt_schedule_notrace(), which is again instrumented by KMSAN.

This patch gets me a working kernel:

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 4ed33b127821..2d62df462d88 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -2000,6 +2000,7 @@ static inline int pfn_valid(unsigned long pfn)
 {
 	struct mem_section *ms;
 	int ret;
+	unsigned long flags;
 
 	/*
 	 * Ensure the upper PAGE_SHIFT bits are clear in the
@@ -2013,9 +2014,9 @@ static inline int pfn_valid(unsigned long pfn)
 	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
 		return 0;
 	ms = __pfn_to_section(pfn);
-	rcu_read_lock();
+	local_irq_save(flags);
 	if (!valid_section(ms)) {
-		rcu_read_unlock();
+		local_irq_restore(flags);
 		return 0;
 	}
 	/*
@@ -2023,7 +2024,7 @@ static inline int pfn_valid(unsigned long pfn)
 	 * the entire section-sized span.
 	 */
 	ret = early_section(ms) || pfn_section_valid(ms, pfn);
-	rcu_read_unlock();
+	local_irq_restore(flags);
 
 	return ret;
 }

Disabling interrupts is a little heavy-handed, and it also assumes the
current RCU implementation, in which an interrupts-disabled region acts
as a read-side critical section. There is
preempt_enable_no_resched_notrace(), but that might be worse because it
breaks scheduling guarantees.
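
To make the trade-off between the two preempt-enable flavours concrete,
here is a small userspace model. It is only a sketch: every model_*
name is made up for illustration, and the real kernel macros are
per-CPU and architecture-specific. It assumes the usual behaviour that
re-enabling preemption enters the scheduler when the count drops to
zero with a reschedule pending, while the no_resched variant only
decrements the count.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: these names mirror kernel concepts but are mocks. */
static int  model_preempt_count;   /* > 0 means preemption is disabled */
static bool model_need_resched;    /* a reschedule request is pending */
static int  model_sched_calls;     /* times the "scheduler" was entered */

/* Stands in for __preempt_schedule_notrace(), which KMSAN instruments. */
static void model_preempt_schedule(void)
{
	model_sched_calls++;
	model_need_resched = false;
}

static void model_preempt_disable(void)
{
	model_preempt_count++;
}

/* Models preempt_enable_notrace(): may enter the scheduler on the way out. */
static void model_preempt_enable(void)
{
	assert(model_preempt_count > 0);
	if (--model_preempt_count == 0 && model_need_resched)
		model_preempt_schedule();
}

/* Models preempt_enable_no_resched_notrace(): never enters the scheduler. */
static void model_preempt_enable_no_resched(void)
{
	assert(model_preempt_count > 0);
	--model_preempt_count;
}
```

In this model, model_preempt_enable() is the path that would recurse
into instrumented code, and model_preempt_enable_no_resched() avoids it
at the cost of leaving model_need_resched set with nobody acting on it.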

That being said, whatever we do here should be wrapped in some
rcu_read_lock/unlock_<newvariant>() helper.

Is there an existing helper we can use? If not, we need a variant that
can be used from extremely constrained contexts that can't even call
into the scheduler. And if we want pfn_valid() to switch to it, it also
should be fast.
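
For illustration, the wrapping could look roughly like the userspace
sketch below. The helper names are hypothetical (no such API is claimed
to exist), local_irq_save()/local_irq_restore() are replaced by mocks
acting on a single flag, and the section checks of pfn_valid() are
reduced to two boolean parameters. The point is only that both return
paths go through the same lock/unlock pair, so the underlying mechanism
could later be swapped without touching pfn_valid().

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace mock: one flag stands in for the CPU interrupt-enable state. */
static bool mock_irqs_enabled = true;

#define mock_local_irq_save(flags) \
	do { (flags) = mock_irqs_enabled; mock_irqs_enabled = false; } while (0)
#define mock_local_irq_restore(flags) \
	do { mock_irqs_enabled = (flags) != 0; } while (0)

/* Hypothetical helper names, used here for illustration only. */
#define sketch_rcu_read_lock_constrained(flags)   mock_local_irq_save(flags)
#define sketch_rcu_read_unlock_constrained(flags) mock_local_irq_restore(flags)

/* Same control flow as the patched pfn_valid(), with the checks stubbed. */
static int sketch_pfn_valid(bool section_valid, bool subsection_valid)
{
	unsigned long flags;
	int ret;

	sketch_rcu_read_lock_constrained(flags);
	assert(!mock_irqs_enabled);	/* inside the read-side section */
	if (!section_valid) {
		sketch_rcu_read_unlock_constrained(flags);
		return 0;
	}
	ret = subsection_valid;
	sketch_rcu_read_unlock_constrained(flags);

	return ret;
}
```

Because the saved state is restored rather than unconditionally
re-enabled, a caller that already runs with interrupts off keeps them
off after the call.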

Thanks,
-- Marco


