linux-mm.kvack.org archive mirror
* [PATCH] mm/percpu: prevent concurrency problem for pcpu_nr_populated read with spin lock
@ 2025-07-02  8:27 Jeongjun Park
  2025-07-02 15:55 ` Christoph Lameter (Ampere)
  2025-07-02 17:03 ` Shakeel Butt
  0 siblings, 2 replies; 9+ messages in thread
From: Jeongjun Park @ 2025-07-02  8:27 UTC
  To: dennis, tj, cl
  Cc: akpm, vbabka, rientjes, linux-mm, linux-kernel,
	syzbot+e5bd32b79413e86f389e, Jeongjun Park

Reads and writes of pcpu_nr_populated should be performed under
pcpu_lock. However, pcpu_nr_pages() reads pcpu_nr_populated without any
protection, which causes a data race between that read and concurrent
writes.

Therefore, take pcpu_lock around the read of pcpu_nr_populated in
pcpu_nr_pages().

Reported-by: syzbot+e5bd32b79413e86f389e@syzkaller.appspotmail.com
Fixes: 7e8a6304d541 ("/proc/meminfo: add percpu populated pages count")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
---
 mm/percpu.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/mm/percpu.c b/mm/percpu.c
index b35494c8ede2..0f98b857fb36 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -3355,7 +3355,13 @@ void __init setup_per_cpu_areas(void)
  */
 unsigned long pcpu_nr_pages(void)
 {
-	return pcpu_nr_populated * pcpu_nr_units;
+	unsigned long flags, ret;
+
+	spin_lock_irqsave(&pcpu_lock, flags);
+	ret = pcpu_nr_populated * pcpu_nr_units;
+	spin_unlock_irqrestore(&pcpu_lock, flags);
+
+	return ret;
 }
 
 /*
--



* Re: [PATCH] mm/percpu: prevent concurrency problem for pcpu_nr_populated read with spin lock
  2025-07-02  8:27 [PATCH] mm/percpu: prevent concurrency problem for pcpu_nr_populated read with spin lock Jeongjun Park
@ 2025-07-02 15:55 ` Christoph Lameter (Ampere)
  2025-07-03  4:45   ` Jeongjun Park
  2025-07-02 17:03 ` Shakeel Butt
  1 sibling, 1 reply; 9+ messages in thread
From: Christoph Lameter (Ampere) @ 2025-07-02 15:55 UTC
  To: Jeongjun Park
  Cc: dennis, tj, akpm, vbabka, rientjes, linux-mm, linux-kernel,
	syzbot+e5bd32b79413e86f389e

On Wed, 2 Jul 2025, Jeongjun Park wrote:

> diff --git a/mm/percpu.c b/mm/percpu.c
> index b35494c8ede2..0f98b857fb36 100644
> --- a/mm/percpu.c
> +++ b/mm/percpu.c
> @@ -3355,7 +3355,13 @@ void __init setup_per_cpu_areas(void)
>   */
>  unsigned long pcpu_nr_pages(void)
>  {
> -	return pcpu_nr_populated * pcpu_nr_units;
> +	unsigned long flags, ret;
> +
> +	spin_lock_irqsave(&pcpu_lock, flags);
> +	ret = pcpu_nr_populated * pcpu_nr_units;
> +	spin_unlock_irqrestore(&pcpu_lock, flags);


Ummm.. What? You are protecting a single read with a spinlock? There needs
to be some updating of data somewhere for this to make sense.


Unless a different critical section protected by the lock sets the value
intermittently to something you are not allowed to see before a final
store of a valid value. But that would be unusual.

Is this an academic exercise or did you really see a problem?

What is racing?





* Re: [PATCH] mm/percpu: prevent concurrency problem for pcpu_nr_populated read with spin lock
  2025-07-02  8:27 [PATCH] mm/percpu: prevent concurrency problem for pcpu_nr_populated read with spin lock Jeongjun Park
  2025-07-02 15:55 ` Christoph Lameter (Ampere)
@ 2025-07-02 17:03 ` Shakeel Butt
  2025-07-03  5:19   ` Jeongjun Park
  1 sibling, 1 reply; 9+ messages in thread
From: Shakeel Butt @ 2025-07-02 17:03 UTC
  To: Jeongjun Park
  Cc: dennis, tj, cl, akpm, vbabka, rientjes, linux-mm, linux-kernel,
	syzbot+e5bd32b79413e86f389e

On Wed, Jul 02, 2025 at 05:27:49PM +0900, Jeongjun Park wrote:
> Read/Write to pcpu_nr_populated should be performed while protected
> by pcpu_lock. However, pcpu_nr_pages() reads pcpu_nr_populated without any
> protection, which causes a data race between read/write.
> 
> Therefore, when reading pcpu_nr_populated in pcpu_nr_pages(), it should be
> modified to be protected by pcpu_lock.
> 
> Reported-by: syzbot+e5bd32b79413e86f389e@syzkaller.appspotmail.com
> Fixes: 7e8a6304d541 ("/proc/meminfo: add percpu populated pages count")
> Signed-off-by: Jeongjun Park <aha310510@gmail.com>
> ---
>  mm/percpu.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/percpu.c b/mm/percpu.c
> index b35494c8ede2..0f98b857fb36 100644
> --- a/mm/percpu.c
> +++ b/mm/percpu.c
> @@ -3355,7 +3355,13 @@ void __init setup_per_cpu_areas(void)
>   */
>  unsigned long pcpu_nr_pages(void)
>  {
> -	return pcpu_nr_populated * pcpu_nr_units;

No need for the lock, as I think the race is fine here. Use something
like the following and add a comment.

	data_race(READ_ONCE(pcpu_nr_populated)) * pcpu_nr_units;
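
For readers who want to try the suggested pattern outside the kernel, the
following is a minimal userspace sketch. The two macros are simplified
stand-ins and an assumption of this example: the kernel's real READ_ONCE()
and data_race() do more (type checks, KCSAN annotations). The variable types
are also assumed here, and __typeof__ relies on a GCC/Clang extension.

```c
#include <assert.h>

/* Simplified userspace stand-ins (assumptions, not the kernel definitions). */
#define READ_ONCE(x)    (*(const volatile __typeof__(x) *)&(x))
#define data_race(expr) (expr)  /* the kernel version also marks the race as intentional for KCSAN */

/* Stand-ins for the mm/percpu.c variables; types are assumed for the sketch. */
static unsigned long pcpu_nr_populated; /* written under pcpu_lock in the kernel */
static int pcpu_nr_units;               /* set once during boot */

/* Lockless reader: a single load of the counter, then a multiply. */
static unsigned long pcpu_nr_pages(void)
{
	return data_race(READ_ONCE(pcpu_nr_populated)) * pcpu_nr_units;
}

/* Small usage example with made-up values. */
static void example(void)
{
	pcpu_nr_populated = 3;
	pcpu_nr_units = 4;
	assert(pcpu_nr_pages() == 12);
}
```

The volatile access forces the compiler to emit exactly one load of the
counter, and in the kernel the data_race() wrapper documents that the
unlocked read is deliberate.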




* Re: [PATCH] mm/percpu: prevent concurrency problem for pcpu_nr_populated read with spin lock
  2025-07-02 15:55 ` Christoph Lameter (Ampere)
@ 2025-07-03  4:45   ` Jeongjun Park
  2025-07-03  5:51     ` Dennis Zhou
  0 siblings, 1 reply; 9+ messages in thread
From: Jeongjun Park @ 2025-07-03  4:45 UTC
  To: Christoph Lameter (Ampere)
  Cc: dennis, tj, akpm, vbabka, rientjes, linux-mm, linux-kernel,
	syzbot+e5bd32b79413e86f389e

Christoph Lameter (Ampere) <cl@gentwo.org> wrote:
>
> On Wed, 2 Jul 2025, Jeongjun Park wrote:
>
> > diff --git a/mm/percpu.c b/mm/percpu.c
> > index b35494c8ede2..0f98b857fb36 100644
> > --- a/mm/percpu.c
> > +++ b/mm/percpu.c
> > @@ -3355,7 +3355,13 @@ void __init setup_per_cpu_areas(void)
> >   */
> >  unsigned long pcpu_nr_pages(void)
> >  {
> > -     return pcpu_nr_populated * pcpu_nr_units;
> > +     unsigned long flags, ret;
> > +
> > +     spin_lock_irqsave(&pcpu_lock, flags);
> > +     ret = pcpu_nr_populated * pcpu_nr_units;
> > +     spin_unlock_irqrestore(&pcpu_lock, flags);
>
>
> Ummm.. What? You are protecting a single read with a spinlock? There needs
> to be some updating of data somewhere for this to make sense.
>
>
> Unless a different critical section protected by the lock sets the value
> intermittendly to something you are not allowed to see before a final
> store of a valid value. But that would be unusual.
>
> This is an academic exercise or did you really see a problem?
>
> What is racing?
>
>

This patch is by no means an academic exercise.

As noted in the Reported-by tag, this race was actually reported by
syzbot [1].

[1]: https://syzkaller.appspot.com/bug?extid=e5bd32b79413e86f389e

pcpu_nr_populated is currently written in pcpu_chunk_populated() and
pcpu_chunk_depopulated(). Since these two functions write
pcpu_nr_populated under the protection of pcpu_lock, there is no
write/write race.

However, since pcpu_nr_pages(), which reads pcpu_nr_populated, is not
protected by pcpu_lock, read/write races can easily occur.

Therefore, I think it is appropriate to protect the read with pcpu_lock,
in line with the comment at the definition of pcpu_nr_populated.

--
Regards,

Jeongjun Park



* Re: [PATCH] mm/percpu: prevent concurrency problem for pcpu_nr_populated read with spin lock
  2025-07-02 17:03 ` Shakeel Butt
@ 2025-07-03  5:19   ` Jeongjun Park
  2025-07-03  5:57     ` Dennis Zhou
  0 siblings, 1 reply; 9+ messages in thread
From: Jeongjun Park @ 2025-07-03  5:19 UTC
  To: Shakeel Butt
  Cc: dennis, tj, cl, akpm, vbabka, rientjes, linux-mm, linux-kernel,
	syzbot+e5bd32b79413e86f389e

Shakeel Butt <shakeel.butt@linux.dev> wrote:
>
> On Wed, Jul 02, 2025 at 05:27:49PM +0900, Jeongjun Park wrote:
> > Read/Write to pcpu_nr_populated should be performed while protected
> > by pcpu_lock. However, pcpu_nr_pages() reads pcpu_nr_populated without any
> > protection, which causes a data race between read/write.
> >
> > Therefore, when reading pcpu_nr_populated in pcpu_nr_pages(), it should be
> > modified to be protected by pcpu_lock.
> >
> > Reported-by: syzbot+e5bd32b79413e86f389e@syzkaller.appspotmail.com
> > Fixes: 7e8a6304d541 ("/proc/meminfo: add percpu populated pages count")
> > Signed-off-by: Jeongjun Park <aha310510@gmail.com>
> > ---
> >  mm/percpu.c | 8 +++++++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/percpu.c b/mm/percpu.c
> > index b35494c8ede2..0f98b857fb36 100644
> > --- a/mm/percpu.c
> > +++ b/mm/percpu.c
> > @@ -3355,7 +3355,13 @@ void __init setup_per_cpu_areas(void)
> >   */
> >  unsigned long pcpu_nr_pages(void)
> >  {
> > -     return pcpu_nr_populated * pcpu_nr_units;
>
> No need for the lock as I think race is fine here. Use something like
> the following and add a comment.
>
>         data_race(READ_ONCE(pcpu_nr_populated)) * pcpu_nr_units;
>

This race itself is not a critical security vulnerability, but it is a
read/write race that actually occurs. Writes to pcpu_nr_populated are
already consistently protected by pcpu_lock, so why do you think the data
race can be ignored only for the read?

--
Regards,

Jeongjun Park



* Re: [PATCH] mm/percpu: prevent concurrency problem for pcpu_nr_populated read with spin lock
  2025-07-03  4:45   ` Jeongjun Park
@ 2025-07-03  5:51     ` Dennis Zhou
  2025-07-03  6:09       ` Jeongjun Park
  2025-07-03 16:39       ` Tejun Heo
  0 siblings, 2 replies; 9+ messages in thread
From: Dennis Zhou @ 2025-07-03  5:51 UTC
  To: Jeongjun Park
  Cc: Christoph Lameter (Ampere), tj, akpm, vbabka, rientjes, linux-mm,
	linux-kernel, syzbot+e5bd32b79413e86f389e

Hello,

On Thu, Jul 03, 2025 at 01:45:36PM +0900, Jeongjun Park wrote:
> Christoph Lameter (Ampere) <cl@gentwo.org> wrote:
> >
> > On Wed, 2 Jul 2025, Jeongjun Park wrote:
> >
> > > diff --git a/mm/percpu.c b/mm/percpu.c
> > > index b35494c8ede2..0f98b857fb36 100644
> > > --- a/mm/percpu.c
> > > +++ b/mm/percpu.c
> > > @@ -3355,7 +3355,13 @@ void __init setup_per_cpu_areas(void)
> > >   */
> > >  unsigned long pcpu_nr_pages(void)
> > >  {
> > > -     return pcpu_nr_populated * pcpu_nr_units;
> > > +     unsigned long flags, ret;
> > > +
> > > +     spin_lock_irqsave(&pcpu_lock, flags);
> > > +     ret = pcpu_nr_populated * pcpu_nr_units;
> > > +     spin_unlock_irqrestore(&pcpu_lock, flags);
> >
> >
> > Ummm.. What? You are protecting a single read with a spinlock? There needs
> > to be some updating of data somewhere for this to make sense.
> >
> >
> > Unless a different critical section protected by the lock sets the value
> > intermittendly to something you are not allowed to see before a final
> > store of a valid value. But that would be unusual.
> >
> > This is an academic exercise or did you really see a problem?
> >
> > What is racing?
> >
> >
> 
> This patch is by no means an academic exercise.
> 
> As written in the reported tag, This race has actually been reported
> in syzbot [1].
> 
> [1]: https://syzkaller.appspot.com/bug?extid=e5bd32b79413e86f389e
> 

A report by syzbot doesn't mean it is a real problem. A production
problem or broken test case is much more urgent.

> pcpu_nr_populated is currently being write in pcpu_chunk_populated()
> and pcpu_chunk_depopulated(), and since this two functions perform
> pcpu_nr_populated write under the protection of pcpu_lock, there is no
> race for write/write.
> 
> However, since pcpu_nr_pages(), which performs a read operation on
> pcpu_nr_populated, is not protected by pcpu_lock, races between read/write
> can easily occur.
> 
> Therefore, I think it is appropriate to protect it through pcpu_lock
> according to the comment written in the definition of pcpu_nr_populated.
> 

You're right that this is a race condition, but this was an intentional
choice, made because the value read here is only used to pass
information to userspace for /proc/meminfo. As Christoph mentioned, the
caller of pcpu_nr_pages() will never see an invalid value, nor does it
really matter.

The pcpu_lock is core to the percpu allocator and isn't something we
would want to blindly expose either.

The appropriate solution here is what Shakeel proposed: just mark the
access with data_race().

Thanks,
Dennis



* Re: [PATCH] mm/percpu: prevent concurrency problem for pcpu_nr_populated read with spin lock
  2025-07-03  5:19   ` Jeongjun Park
@ 2025-07-03  5:57     ` Dennis Zhou
  0 siblings, 0 replies; 9+ messages in thread
From: Dennis Zhou @ 2025-07-03  5:57 UTC
  To: Jeongjun Park
  Cc: Shakeel Butt, tj, cl, akpm, vbabka, rientjes, linux-mm,
	linux-kernel, syzbot+e5bd32b79413e86f389e

On Thu, Jul 03, 2025 at 02:19:34PM +0900, Jeongjun Park wrote:
> Shakeel Butt <shakeel.butt@linux.dev> wrote:
> >
> > On Wed, Jul 02, 2025 at 05:27:49PM +0900, Jeongjun Park wrote:
> > > Read/Write to pcpu_nr_populated should be performed while protected
> > > by pcpu_lock. However, pcpu_nr_pages() reads pcpu_nr_populated without any
> > > protection, which causes a data race between read/write.
> > >
> > > Therefore, when reading pcpu_nr_populated in pcpu_nr_pages(), it should be
> > > modified to be protected by pcpu_lock.
> > >
> > > Reported-by: syzbot+e5bd32b79413e86f389e@syzkaller.appspotmail.com
> > > Fixes: 7e8a6304d541 ("/proc/meminfo: add percpu populated pages count")
> > > Signed-off-by: Jeongjun Park <aha310510@gmail.com>
> > > ---
> > >  mm/percpu.c | 8 +++++++-
> > >  1 file changed, 7 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/mm/percpu.c b/mm/percpu.c
> > > index b35494c8ede2..0f98b857fb36 100644
> > > --- a/mm/percpu.c
> > > +++ b/mm/percpu.c
> > > @@ -3355,7 +3355,13 @@ void __init setup_per_cpu_areas(void)
> > >   */
> > >  unsigned long pcpu_nr_pages(void)
> > >  {
> > > -     return pcpu_nr_populated * pcpu_nr_units;
> >
> > No need for the lock as I think race is fine here. Use something like
> > the following and add a comment.
> >
> >         data_race(READ_ONCE(pcpu_nr_populated)) * pcpu_nr_units;
> >
> 
> This race itself is not a critical security vuln, but it is a read/write
> race that actually occurs. Writing to pcpu_nr_populated is already
> systematically protected through pcpu_lock, so why do you think you can
> ignore the data race only when reading?
> 

As mentioned in the other thread, the reader of this value is
/proc/meminfo and reading a stale value isn't the end of the world
either.

Thanks,
Dennis



* Re: [PATCH] mm/percpu: prevent concurrency problem for pcpu_nr_populated read with spin lock
  2025-07-03  5:51     ` Dennis Zhou
@ 2025-07-03  6:09       ` Jeongjun Park
  2025-07-03 16:39       ` Tejun Heo
  1 sibling, 0 replies; 9+ messages in thread
From: Jeongjun Park @ 2025-07-03  6:09 UTC
  To: Dennis Zhou
  Cc: Christoph Lameter (Ampere), tj, akpm, vbabka, rientjes, linux-mm,
	linux-kernel, syzbot+e5bd32b79413e86f389e

Hello,

Dennis Zhou <dennis@kernel.org> wrote:
>
> Hello,
>
> On Thu, Jul 03, 2025 at 01:45:36PM +0900, Jeongjun Park wrote:
> > Christoph Lameter (Ampere) <cl@gentwo.org> wrote:
> > >
> > > On Wed, 2 Jul 2025, Jeongjun Park wrote:
> > >
> > > > diff --git a/mm/percpu.c b/mm/percpu.c
> > > > index b35494c8ede2..0f98b857fb36 100644
> > > > --- a/mm/percpu.c
> > > > +++ b/mm/percpu.c
> > > > @@ -3355,7 +3355,13 @@ void __init setup_per_cpu_areas(void)
> > > >   */
> > > >  unsigned long pcpu_nr_pages(void)
> > > >  {
> > > > -     return pcpu_nr_populated * pcpu_nr_units;
> > > > +     unsigned long flags, ret;
> > > > +
> > > > +     spin_lock_irqsave(&pcpu_lock, flags);
> > > > +     ret = pcpu_nr_populated * pcpu_nr_units;
> > > > +     spin_unlock_irqrestore(&pcpu_lock, flags);
> > >
> > >
> > > Ummm.. What? You are protecting a single read with a spinlock? There needs
> > > to be some updating of data somewhere for this to make sense.
> > >
> > >
> > > Unless a different critical section protected by the lock sets the value
> > > intermittendly to something you are not allowed to see before a final
> > > store of a valid value. But that would be unusual.
> > >
> > > This is an academic exercise or did you really see a problem?
> > >
> > > What is racing?
> > >
> > >
> >
> > This patch is by no means an academic exercise.
> >
> > As written in the reported tag, This race has actually been reported
> > in syzbot [1].
> >
> > [1]: https://syzkaller.appspot.com/bug?extid=e5bd32b79413e86f389e
> >
>
> A report by syzbot doesn't mean it is a real problem. A production
> problem or broken test case is much more urgent.
>
> > pcpu_nr_populated is currently being write in pcpu_chunk_populated()
> > and pcpu_chunk_depopulated(), and since this two functions perform
> > pcpu_nr_populated write under the protection of pcpu_lock, there is no
> > race for write/write.
> >
> > However, since pcpu_nr_pages(), which performs a read operation on
> > pcpu_nr_populated, is not protected by pcpu_lock, races between read/write
> > can easily occur.
> >
> > Therefore, I think it is appropriate to protect it through pcpu_lock
> > according to the comment written in the definition of pcpu_nr_populated.
> >
>
> You're right that this is a race condition, but this was an intention
> choice done because the value read here is only being used to pass
> information to userspace for /proc/meminfo. As Christoph mentioned, the
> caller of pcpu_nr_pages() will never see an invalid value nor does it
> really matter either.
>
> The pcpu_lock is core to the percpu allocator and isn't something we
> would want to blindly expose either.
>
> The appropriate solution here is what Shakeel proposed to just mark the
> access as a data_race().
>
> Thanks,
> Dennis

If this data race was intentional, it makes sense why the code was
written this way. I'll send a v2 patch with the fix Shakeel proposed.
--
Regards,

Jeongjun Park



* Re: [PATCH] mm/percpu: prevent concurrency problem for pcpu_nr_populated read with spin lock
  2025-07-03  5:51     ` Dennis Zhou
  2025-07-03  6:09       ` Jeongjun Park
@ 2025-07-03 16:39       ` Tejun Heo
  1 sibling, 0 replies; 9+ messages in thread
From: Tejun Heo @ 2025-07-03 16:39 UTC
  To: Dennis Zhou
  Cc: Jeongjun Park, Christoph Lameter (Ampere), akpm, vbabka, rientjes,
	linux-mm, linux-kernel, syzbot+e5bd32b79413e86f389e

On Wed, Jul 02, 2025 at 10:51:25PM -0700, Dennis Zhou wrote:
> > However, since pcpu_nr_pages(), which performs a read operation on
> > pcpu_nr_populated, is not protected by pcpu_lock, races between read/write
> > can easily occur.
> > 
> > Therefore, I think it is appropriate to protect it through pcpu_lock
> > according to the comment written in the definition of pcpu_nr_populated.
> 
> You're right that this is a race condition, but this was an intention
> choice done because the value read here is only being used to pass
> information to userspace for /proc/meminfo. As Christoph mentioned, the
> caller of pcpu_nr_pages() will never see an invalid value nor does it
> really matter either.

This isn't an actual race condition. The value can be read atomically and an
unprotected read can't lead to a result which wouldn't be possible when
reading under the lock, i.e., whether the lock is added or not, the end
result doesn't change. It's just that syzbot can't tell the difference.
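
The point above can be illustrated with a small userspace experiment. This
is only a sketch under stated assumptions: a pthread mutex stands in for
pcpu_lock, a C11 atomic stands in for pcpu_nr_populated, and MAX_POPULATED,
ITERATIONS, and run_demo() are hypothetical names invented for the demo. The
writer only updates the counter under the lock, and the reader samples it
with a single atomic load and no lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_POPULATED 8UL    /* hypothetical upper bound for the demo */
#define ITERATIONS    200000 /* hypothetical sample count */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER; /* stands in for pcpu_lock */
static atomic_ulong nr_populated;                         /* stands in for pcpu_nr_populated */

/* Writer: every update happens under the lock and stays in [0, MAX_POPULATED]. */
static void *writer(void *arg)
{
	(void)arg;
	for (int i = 0; i < ITERATIONS; i++) {
		pthread_mutex_lock(&lock);
		unsigned long v = atomic_load_explicit(&nr_populated, memory_order_relaxed);
		v = (v == MAX_POPULATED) ? 0 : v + 1;
		atomic_store_explicit(&nr_populated, v, memory_order_relaxed);
		pthread_mutex_unlock(&lock);
	}
	return NULL;
}

/* Reader: no lock, one atomic load per sample; counts out-of-range values. */
static void *reader(void *arg)
{
	unsigned long *bad = arg;
	for (int i = 0; i < ITERATIONS; i++) {
		unsigned long v = atomic_load_explicit(&nr_populated, memory_order_relaxed);
		if (v > MAX_POPULATED)
			(*bad)++;
	}
	return NULL;
}

/* Runs one writer and one reader; returns how many impossible values were seen. */
static unsigned long run_demo(void)
{
	pthread_t w, r;
	unsigned long bad = 0;
	int rc;

	rc = pthread_create(&w, NULL, writer, NULL);
	assert(rc == 0);
	rc = pthread_create(&r, NULL, reader, &bad);
	assert(rc == 0);
	pthread_join(w, NULL);
	pthread_join(r, NULL);
	return bad;
}
```

Because each sample is a single whole-word load, the reader can only observe
a value that the writer actually stored, which is the same set of values a
locked reader could observe. The sample may be slightly stale, but it is
never torn or out of range.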

Thanks.

-- 
tejun


