linux-mm.kvack.org archive mirror
* [patch 0/2] vmstat: Speedup and Cleanup
@ 2016-02-22 18:10 Christoph Lameter
  2016-02-22 18:10 ` [patch 1/2] vmstat: Optimize refresh_cpu_vmstat() Christoph Lameter
  2016-02-22 18:10 ` [patch 2/2] vmstat: Get rid of the ugly cpu_stat_off variable Christoph Lameter
  0 siblings, 2 replies; 6+ messages in thread
From: Christoph Lameter @ 2016-02-22 18:10 UTC (permalink / raw)
  To: akpm; +Cc: Michal Hocko, Tejun Heo, Tetsuo Handa, linux-mm, hannes, mgorman

I have not been satisfied with the recent code changes, and in
particular not with the races and the performance regressions I have
seen. This is an attempt to address those by making some small code
changes and by removing the cpu_stat_off cpumask, which was the reason
for the races.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org


* [patch 1/2] vmstat: Optimize refresh_cpu_vmstat()
  2016-02-22 18:10 [patch 0/2] vmstat: Speedup and Cleanup Christoph Lameter
@ 2016-02-22 18:10 ` Christoph Lameter
  2016-02-24 17:38   ` Michal Hocko
  2016-02-22 18:10 ` [patch 2/2] vmstat: Get rid of the ugly cpu_stat_off variable Christoph Lameter
  1 sibling, 1 reply; 6+ messages in thread
From: Christoph Lameter @ 2016-02-22 18:10 UTC (permalink / raw)
  To: akpm; +Cc: Michal Hocko, Tejun Heo, Tetsuo Handa, linux-mm, hannes, mgorman

[-- Attachment #1: vmstat_speed_up --]
[-- Type: text/plain, Size: 3202 bytes --]

Create a new function zone_needs_update() that uses memchr_inv() to
check whether any of the diffs are nonzero.

If we use this function in refresh_cpu_vm_stats() then we can avoid the
this_cpu_xchg() loop over all differentials when none of them have
changed. This becomes particularly important as the number of counters
keeps increasing.

This also avoids dirtying the cachelines holding the differentials
unnecessarily.

Also add some likely()/unlikely() annotations to keep the icache
footprint low when there are no updates to process.
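The fast-path check described above can be sketched in plain C. This is
a userspace model, not kernel code: the struct, the array size, and the
memchr_inv() stand-in below are illustrative assumptions; the kernel's
memchr_inv() returns a pointer to the first byte differing from the
given byte, or NULL if all bytes match, so one linear scan decides
whether any per-cpu diff is nonzero.

```c
#include <stddef.h>

#define NR_VM_ZONE_STAT_ITEMS 32	/* illustrative size */

/* Model of the per-cpu pageset: byte-sized counter differentials. */
struct pageset_model {
	signed char vm_stat_diff[NR_VM_ZONE_STAT_ITEMS];
};

/* Minimal stand-in for the kernel's memchr_inv(). */
static const void *memchr_inv_model(const void *s, int c, size_t n)
{
	const unsigned char *p = s;
	size_t i;

	for (i = 0; i < n; i++)
		if (p[i] != (unsigned char)c)
			return p + i;
	return NULL;
}

/* The check from the patch: any nonzero diff byte means work to do. */
static int zone_needs_update_model(const struct pageset_model *p)
{
	return memchr_inv_model(p->vm_stat_diff, 0,
				NR_VM_ZONE_STAT_ITEMS) != NULL;
}
```

The payoff is that the common idle case touches the diff cachelines
read-only and skips the per-counter this_cpu_xchg() loop entirely.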

Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/mm/vmstat.c
===================================================================
--- linux.orig/mm/vmstat.c	2016-02-22 11:54:02.179095030 -0600
+++ linux/mm/vmstat.c	2016-02-22 11:54:24.338528277 -0600
@@ -444,6 +444,18 @@ static int fold_diff(int *diff)
 	return changes;
 }
 
+bool zone_needs_update(struct per_cpu_pageset *p)
+{
+
+	BUILD_BUG_ON(sizeof(p->vm_stat_diff[0]) != 1);
+	/*
+	 * The fast way of checking if there are any vmstat diffs.
+	 * This works because the diffs are byte sized items.
+	 */
+	return memchr_inv(p->vm_stat_diff, 0,
+			NR_VM_ZONE_STAT_ITEMS) != NULL;
+}
+
 /*
  * Update the zone counters for the current cpu.
  *
@@ -470,18 +482,20 @@ static int refresh_cpu_vm_stats(bool do_
 	for_each_populated_zone(zone) {
 		struct per_cpu_pageset __percpu *p = zone->pageset;
 
-		for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++) {
-			int v;
+		if (unlikely(zone_needs_update(this_cpu_ptr(p)))) {
+			for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++) {
+				int v;
 
-			v = this_cpu_xchg(p->vm_stat_diff[i], 0);
-			if (v) {
+				v = this_cpu_xchg(p->vm_stat_diff[i], 0);
+				if (v) {
 
-				atomic_long_add(v, &zone->vm_stat[i]);
-				global_diff[i] += v;
+					atomic_long_add(v, &zone->vm_stat[i]);
+					global_diff[i] += v;
 #ifdef CONFIG_NUMA
-				/* 3 seconds idle till flush */
-				__this_cpu_write(p->expire, 3);
+					/* 3 seconds idle till flush */
+					__this_cpu_write(p->expire, 3);
 #endif
+				}
 			}
 		}
 #ifdef CONFIG_NUMA
@@ -494,8 +508,8 @@ static int refresh_cpu_vm_stats(bool do_
 			 * Check if there are pages remaining in this pageset
 			 * if not then there is nothing to expire.
 			 */
-			if (!__this_cpu_read(p->expire) ||
-			       !__this_cpu_read(p->pcp.count))
+			if (likely(!__this_cpu_read(p->expire) ||
+			       !__this_cpu_read(p->pcp.count)))
 				continue;
 
 			/*
@@ -1440,19 +1454,12 @@ static bool need_update(int cpu)
 	for_each_populated_zone(zone) {
 		struct per_cpu_pageset *p = per_cpu_ptr(zone->pageset, cpu);
 
-		BUILD_BUG_ON(sizeof(p->vm_stat_diff[0]) != 1);
-		/*
-		 * The fast way of checking if there are any vmstat diffs.
-		 * This works because the diffs are byte sized items.
-		 */
-		if (memchr_inv(p->vm_stat_diff, 0, NR_VM_ZONE_STAT_ITEMS))
+		if (zone_needs_update(p))
 			return true;
-
 	}
 	return false;
 }
 
-
 /*
  * Shepherd worker thread that checks the
  * differentials of processors that have their worker


* [patch 2/2] vmstat: Get rid of the ugly cpu_stat_off variable
  2016-02-22 18:10 [patch 0/2] vmstat: Speedup and Cleanup Christoph Lameter
  2016-02-22 18:10 ` [patch 1/2] vmstat: Optimize refresh_cpu_vmstat() Christoph Lameter
@ 2016-02-22 18:10 ` Christoph Lameter
  2016-02-24  0:23   ` Andrew Morton
  1 sibling, 1 reply; 6+ messages in thread
From: Christoph Lameter @ 2016-02-22 18:10 UTC (permalink / raw)
  To: akpm; +Cc: Michal Hocko, Tejun Heo, Tetsuo Handa, linux-mm, hannes, mgorman

[-- Attachment #1: vmstat_no_cpu_off --]
[-- Type: text/plain, Size: 3752 bytes --]

The cpu_stat_off variable is unnecessary since we can instead check
whether a vmstat work item is pending. This makes it easy for the
shepherd to ensure that the proper things happen.

Removing the state also removes all races related to it. Should a work
item not be scheduled as needed for vmstat_update, the shepherd will
notice and schedule it. Should a work item be scheduled unnecessarily,
the vmstat updater will disable it.

Thus vmstat_idle can also be simplified.
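The scheme can be sketched with a toy single-threaded model. This is an
assumption-laden illustration, not the kernel workqueue API: plain
booleans stand in for delayed_work_pending() and need_update(), and the
loop stands in for the shepherd's for_each_online_cpu() pass.

```c
#include <stdbool.h>

#define NR_CPUS_MODEL 4

/* Per-cpu state the shepherd consults in this model. */
struct cpu_model {
	bool work_pending;	/* stand-in for delayed_work_pending() */
	bool has_diffs;		/* stand-in for need_update(cpu) */
	int queued;		/* how many times the shepherd queued work */
};

/*
 * One shepherd pass: with no cpu_stat_off mask, the pending state of
 * the work item itself decides whether anything must be queued.
 */
static void shepherd_pass_model(struct cpu_model *cpus, int n)
{
	int cpu;

	for (cpu = 0; cpu < n; cpu++) {
		if (!cpus[cpu].work_pending && cpus[cpu].has_diffs) {
			cpus[cpu].work_pending = true;
			cpus[cpu].queued++;
		}
	}
}
```

Because "pending" is the only state, there is no separate mask that can
disagree with it, which is where the removed races lived.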

Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/mm/vmstat.c
===================================================================
--- linux.orig/mm/vmstat.c	2016-02-22 11:55:59.432096146 -0600
+++ linux/mm/vmstat.c	2016-02-22 12:01:22.883825094 -0600
@@ -1401,7 +1401,6 @@ static const struct file_operations proc
 static struct workqueue_struct *vmstat_wq;
 static DEFINE_PER_CPU(struct delayed_work, vmstat_work);
 int sysctl_stat_interval __read_mostly = HZ;
-static cpumask_var_t cpu_stat_off;
 
 static void vmstat_update(struct work_struct *w)
 {
@@ -1414,15 +1413,6 @@ static void vmstat_update(struct work_st
 		queue_delayed_work_on(smp_processor_id(), vmstat_wq,
 			this_cpu_ptr(&vmstat_work),
 			round_jiffies_relative(sysctl_stat_interval));
-	} else {
-		/*
-		 * We did not update any counters so the app may be in
-		 * a mode where it does not cause counter updates.
-		 * We may be uselessly running vmstat_update.
-		 * Defer the checking for differentials to the
-		 * shepherd thread on a different processor.
-		 */
-		cpumask_set_cpu(smp_processor_id(), cpu_stat_off);
 	}
 }
 
@@ -1436,11 +1426,8 @@ void quiet_vmstat(void)
 	if (system_state != SYSTEM_RUNNING)
 		return;
 
-	do {
-		if (!cpumask_test_and_set_cpu(smp_processor_id(), cpu_stat_off))
-			cancel_delayed_work(this_cpu_ptr(&vmstat_work));
-
-	} while (refresh_cpu_vm_stats(false));
+	refresh_cpu_vm_stats(false);
+	cancel_delayed_work(this_cpu_ptr(&vmstat_work));
 }
 
 /*
@@ -1476,13 +1463,12 @@ static void vmstat_shepherd(struct work_
 
 	get_online_cpus();
 	/* Check processors whose vmstat worker threads have been disabled */
-	for_each_cpu(cpu, cpu_stat_off)
-		if (need_update(cpu) &&
-			cpumask_test_and_clear_cpu(cpu, cpu_stat_off))
-
-			queue_delayed_work_on(cpu, vmstat_wq,
-				&per_cpu(vmstat_work, cpu), 0);
+	for_each_online_cpu(cpu) {
+		struct delayed_work *worker = &per_cpu(vmstat_work, cpu);
 
+		if (!delayed_work_pending(worker) && need_update(cpu))
+			queue_delayed_work_on(cpu, vmstat_wq, worker, 0);
+	}
 	put_online_cpus();
 
 	schedule_delayed_work(&shepherd,
@@ -1498,10 +1484,6 @@ static void __init start_shepherd_timer(
 		INIT_DELAYED_WORK(per_cpu_ptr(&vmstat_work, cpu),
 			vmstat_update);
 
-	if (!alloc_cpumask_var(&cpu_stat_off, GFP_KERNEL))
-		BUG();
-	cpumask_copy(cpu_stat_off, cpu_online_mask);
-
 	vmstat_wq = alloc_workqueue("vmstat", WQ_FREEZABLE|WQ_MEM_RECLAIM, 0);
 	schedule_delayed_work(&shepherd,
 		round_jiffies_relative(sysctl_stat_interval));
@@ -1536,16 +1518,13 @@ static int vmstat_cpuup_callback(struct
 	case CPU_ONLINE_FROZEN:
 		refresh_zone_stat_thresholds();
 		node_set_state(cpu_to_node(cpu), N_CPU);
-		cpumask_set_cpu(cpu, cpu_stat_off);
 		break;
 	case CPU_DOWN_PREPARE:
 	case CPU_DOWN_PREPARE_FROZEN:
 		cancel_delayed_work_sync(&per_cpu(vmstat_work, cpu));
-		cpumask_clear_cpu(cpu, cpu_stat_off);
 		break;
 	case CPU_DOWN_FAILED:
 	case CPU_DOWN_FAILED_FROZEN:
-		cpumask_set_cpu(cpu, cpu_stat_off);
 		break;
 	case CPU_DEAD:
 	case CPU_DEAD_FROZEN:


* Re: [patch 2/2] vmstat: Get rid of the ugly cpu_stat_off variable
  2016-02-22 18:10 ` [patch 2/2] vmstat: Get rid of the ugly cpu_stat_off variable Christoph Lameter
@ 2016-02-24  0:23   ` Andrew Morton
  2016-02-24 10:07     ` Michal Hocko
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Morton @ 2016-02-24  0:23 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Michal Hocko, Tejun Heo, Tetsuo Handa, linux-mm, hannes, mgorman

On Mon, 22 Feb 2016 12:10:42 -0600 Christoph Lameter <cl@linux.com> wrote:

> The cpu_stat_off variable is unnecessary since we can instead check
> whether a vmstat work item is pending. This makes it easy for the
> shepherd to ensure that the proper things happen.
> 
> Removing the state also removes all races related to it. Should a work
> item not be scheduled as needed for vmstat_update, the shepherd will
> notice and schedule it. Should a work item be scheduled unnecessarily,
> the vmstat updater will disable it.
> 
> Thus vmstat_idle can also be simplified.

I'm getting rather a lot of rejects from this one.

>  
> @@ -1436,11 +1426,8 @@ void quiet_vmstat(void)
>  	if (system_state != SYSTEM_RUNNING)
>  		return;
>  
> -	do {
> -		if (!cpumask_test_and_set_cpu(smp_processor_id(), cpu_stat_off))
> -			cancel_delayed_work(this_cpu_ptr(&vmstat_work));
> -
> -	} while (refresh_cpu_vm_stats(false));
> +	refresh_cpu_vm_stats(false);
> +	cancel_delayed_work(this_cpu_ptr(&vmstat_work));
>  }

I can't find a quiet_vmstat() which looks like this.  What tree are you
patching?



* Re: [patch 2/2] vmstat: Get rid of the ugly cpu_stat_off variable
  2016-02-24  0:23   ` Andrew Morton
@ 2016-02-24 10:07     ` Michal Hocko
  0 siblings, 0 replies; 6+ messages in thread
From: Michal Hocko @ 2016-02-24 10:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Lameter, Tejun Heo, Tetsuo Handa, linux-mm, hannes,
	mgorman

On Tue 23-02-16 16:23:45, Andrew Morton wrote:
> On Mon, 22 Feb 2016 12:10:42 -0600 Christoph Lameter <cl@linux.com> wrote:
> 
> > The cpu_stat_off variable is unnecessary since we can instead check
> > whether a vmstat work item is pending. This makes it easy for the
> > shepherd to ensure that the proper things happen.
> > 
> > Removing the state also removes all races related to it. Should a work
> > item not be scheduled as needed for vmstat_update, the shepherd will
> > notice and schedule it. Should a work item be scheduled unnecessarily,
> > the vmstat updater will disable it.
> > 
> > Thus vmstat_idle can also be simplified.
> 
> I'm getting rather a lot of rejects from this one.
> 
> >  
> > @@ -1436,11 +1426,8 @@ void quiet_vmstat(void)
> >  	if (system_state != SYSTEM_RUNNING)
> >  		return;
> >  
> > -	do {
> > -		if (!cpumask_test_and_set_cpu(smp_processor_id(), cpu_stat_off))
> > -			cancel_delayed_work(this_cpu_ptr(&vmstat_work));
> > -
> > -	} while (refresh_cpu_vm_stats(false));
> > +	refresh_cpu_vm_stats(false);
> > +	cancel_delayed_work(this_cpu_ptr(&vmstat_work));
> >  }
> 
> I can't find a quiet_vmstat() which looks like this.  What tree are you
> patching?

This seems to be pre f01f17d3705b ("mm, vmstat: make quiet_vmstat
lighter")

-- 
Michal Hocko
SUSE Labs


* Re: [patch 1/2] vmstat: Optimize refresh_cpu_vmstat()
  2016-02-22 18:10 ` [patch 1/2] vmstat: Optimize refresh_cpu_vmstat() Christoph Lameter
@ 2016-02-24 17:38   ` Michal Hocko
  0 siblings, 0 replies; 6+ messages in thread
From: Michal Hocko @ 2016-02-24 17:38 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, Tejun Heo, Tetsuo Handa, linux-mm, hannes, mgorman

On Mon 22-02-16 12:10:41, Christoph Lameter wrote:
> Create a new function zone_needs_update() that uses memchr_inv() to
> check whether any of the diffs are nonzero.
> 
> If we use this function in refresh_cpu_vm_stats() then we can avoid the
> this_cpu_xchg() loop over all differentials when none of them have
> changed. This becomes particularly important as the number of counters
> keeps increasing.
> 
> This also avoids dirtying the cachelines holding the differentials
> unnecessarily.
> 
> Also add some likely()/unlikely() annotations to keep the icache
> footprint low when there are no updates to process.

Do you have any numbers? Can you actually measure any interference
from refresh_cpu_vm_stats()?

> Signed-off-by: Christoph Lameter <cl@linux.com>
> 
> Index: linux/mm/vmstat.c
> ===================================================================
> --- linux.orig/mm/vmstat.c	2016-02-22 11:54:02.179095030 -0600
> +++ linux/mm/vmstat.c	2016-02-22 11:54:24.338528277 -0600
> @@ -444,6 +444,18 @@ static int fold_diff(int *diff)
>  	return changes;
>  }
>  
> +bool zone_needs_update(struct per_cpu_pageset *p)

static bool ....

> +{
> +
> +	BUILD_BUG_ON(sizeof(p->vm_stat_diff[0]) != 1);
> +	/*
> +	 * The fast way of checking if there are any vmstat diffs.
> +	 * This works because the diffs are byte sized items.
> +	 */
> +	return memchr_inv(p->vm_stat_diff, 0,
> +			NR_VM_ZONE_STAT_ITEMS) != NULL;
> +}
> +
>  /*
>   * Update the zone counters for the current cpu.
>   *
> @@ -470,18 +482,20 @@ static int refresh_cpu_vm_stats(bool do_
>  	for_each_populated_zone(zone) {
>  		struct per_cpu_pageset __percpu *p = zone->pageset;
>  
> -		for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++) {
> -			int v;
> +		if (unlikely(zone_needs_update(this_cpu_ptr(p)))) {

Why unlikely? The generated code looks exactly the same with or without
it (same for the other likely() annotation added by this patch).

[...]
-- 
Michal Hocko
SUSE Labs

