[PATCH] mm: page_alloc: control latency caused by zone PCP draining

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] mm: page_alloc: control latency caused by zone PCP draining
@ 2024-03-18 20:07 Lucas Stach
  2024-03-19 21:08 ` Andrew Morton
  0 siblings, 1 reply; 3+ messages in thread
From: Lucas Stach @ 2024-03-18 20:07 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, kernel, patchwork-lst, Mel Gorman

When the complete PCP is drained a much larger number of pages
than the usual batch size might be freed at once, causing large
IRQ and preemption latency spikes, as they are all freed while
holding the pcp and zone spinlocks.

To avoid those latency spikes, limit the number of pages freed
in a single bulk operation to common batch limits.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
---
 mm/page_alloc.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a663202045dc..64a6f9823c8c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2215,12 +2215,15 @@ void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp)
  */
 static void drain_pages_zone(unsigned int cpu, struct zone *zone)
 {
-	struct per_cpu_pages *pcp;
+	struct per_cpu_pages *pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu);
+	int count = READ_ONCE(pcp->count);
+
+	while (count) {
+		int to_drain = min(count, pcp->batch << CONFIG_PCP_BATCH_SCALE_MAX);
+		count -= to_drain;
 
-	pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu);
-	if (pcp->count) {
 		spin_lock(&pcp->lock);
-		free_pcppages_bulk(zone, pcp->count, pcp, 0);
+		free_pcppages_bulk(zone, to_drain, pcp, 0);
 		spin_unlock(&pcp->lock);
 	}
 }
-- 
2.39.2



^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] mm: page_alloc: control latency caused by zone PCP draining
  2024-03-18 20:07 [PATCH] mm: page_alloc: control latency caused by zone PCP draining Lucas Stach
@ 2024-03-19 21:08 ` Andrew Morton
  2024-03-20 10:44   ` Lucas Stach
  0 siblings, 1 reply; 3+ messages in thread
From: Andrew Morton @ 2024-03-19 21:08 UTC (permalink / raw)
  To: Lucas Stach; +Cc: linux-mm, kernel, patchwork-lst, Mel Gorman

On Mon, 18 Mar 2024 21:07:36 +0100 Lucas Stach <l.stach@pengutronix.de> wrote:

> When the complete PCP is drained a much larger number of pages
> than the usual batch size might be freed at once,

How much larger?  Please include the numbers here.

> causing large
> IRQ and preemption latency spikes, as they are all freed while
> holding the pcp and zone spinlocks.

How large are these spikes?

> To avoid those latency spikes, limit the number of pages freed
> in a single bulk operation to common batch limits.
> 

And how large are they after this?

> ---
>  mm/page_alloc.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index a663202045dc..64a6f9823c8c 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2215,12 +2215,15 @@ void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp)
>   */
>  static void drain_pages_zone(unsigned int cpu, struct zone *zone)
>  {
> -	struct per_cpu_pages *pcp;
> +	struct per_cpu_pages *pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu);
> +	int count = READ_ONCE(pcp->count);
> +
> +	while (count) {
> +		int to_drain = min(count, pcp->batch << CONFIG_PCP_BATCH_SCALE_MAX);
> +		count -= to_drain;
>  
> -	pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu);
> -	if (pcp->count) {
>  		spin_lock(&pcp->lock);
> -		free_pcppages_bulk(zone, pcp->count, pcp, 0);
> +		free_pcppages_bulk(zone, to_drain, pcp, 0);
>  		spin_unlock(&pcp->lock);
>  	}

I'm not seeing what prevents two CPUs from trying to free the same
pages simultaneously?



^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] mm: page_alloc: control latency caused by zone PCP draining
  2024-03-19 21:08 ` Andrew Morton
@ 2024-03-20 10:44   ` Lucas Stach
  0 siblings, 0 replies; 3+ messages in thread
From: Lucas Stach @ 2024-03-20 10:44 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, kernel, patchwork-lst, Mel Gorman

Am Dienstag, dem 19.03.2024 um 14:08 -0700 schrieb Andrew Morton:
> On Mon, 18 Mar 2024 21:07:36 +0100 Lucas Stach <l.stach@pengutronix.de> wrote:
> 
> > When the complete PCP is drained a much larger number of pages
> > than the usual batch size might be freed at once,
> 
> How much larger?  Please include the numbers here.
> 
> > causing large
> > IRQ and preemption latency spikes, as they are all freed while
> > holding the pcp and zone spinlocks.
> 
> How large are these spikes?
> 
> > To avoid those latency spikes, limit the number of pages freed
> > in a single bulk operation to common batch limits.
> > 
> 
> And how large are they after this?
> 
Will do. However the tracing overhead on the little ARM systems where
this is hurting me is pretty significant, so the numbers with tracing
enabled need to be taken with a grain of salt.

> > ---
> >  mm/page_alloc.c | 11 +++++++----
> >  1 file changed, 7 insertions(+), 4 deletions(-)
> > 
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index a663202045dc..64a6f9823c8c 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -2215,12 +2215,15 @@ void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp)
> >   */
> >  static void drain_pages_zone(unsigned int cpu, struct zone *zone)
> >  {
> > -	struct per_cpu_pages *pcp;
> > +	struct per_cpu_pages *pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu);
> > +	int count = READ_ONCE(pcp->count);
> > +
> > +	while (count) {
> > +		int to_drain = min(count, pcp->batch << CONFIG_PCP_BATCH_SCALE_MAX);
> > +		count -= to_drain;
> >  
> > -	pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu);
> > -	if (pcp->count) {
> >  		spin_lock(&pcp->lock);
> > -		free_pcppages_bulk(zone, pcp->count, pcp, 0);
> > +		free_pcppages_bulk(zone, to_drain, pcp, 0);
> >  		spin_unlock(&pcp->lock);
> >  	}
> 
> I'm not seeing what prevents two CPUs from trying to free the same
> pages simultaneously?
> 
The spinlocks around the batches are still in place, so there is no
issue of multiple CPU trying to free the same pages. If you worry about
multiple CPUs taking turns in acquiring the locks and thus interleaving
at batch granularity, that's now possible but I don't think it's an
issue as free_pcppages_bulk will simply drop out when the are no more
pages on the PCP to free. So the CPUs will take batch sized turns in
draining the PCP, but all of them will stop gracefully when there are
no more pages to free.

Regards,
Lucas


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2024-03-20 10:44 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-03-18 20:07 [PATCH] mm: page_alloc: control latency caused by zone PCP draining Lucas Stach
2024-03-19 21:08 ` Andrew Morton
2024-03-20 10:44   ` Lucas Stach

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).