public inbox for linux-kernel@vger.kernel.org
From: Tejun Heo <tj@kernel.org>
To: Christoph Lameter <cl@linux.com>
Cc: akpm@linux-foundation.org, Pekka Enberg <penberg@cs.helsinki.fi>,
	linux-kernel@vger.kernel.org,
	Eric Dumazet <eric.dumazet@gmail.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Subject: Re: [cpuops cmpxchg V2 4/5] vmstat: User per cpu atomics to avoid interrupt disable / enable
Date: Wed, 15 Dec 2010 17:45:22 +0100
Message-ID: <4D08F0A2.9010301@kernel.org>
In-Reply-To: <20101214162854.811759020@linux.com>

On 12/14/2010 05:28 PM, Christoph Lameter wrote:
> Currently the operations to increment vm counters must disable interrupts
> in order to not mess up their housekeeping of counters.
> 
> So use this_cpu_cmpxchg() to avoid the overhead. Since we can no longer
> count on preemption being disabled we still have some minor issues.
> The fetching of the counter thresholds is racy.
> A threshold from another cpu may be applied if we happen to be
> rescheduled on another cpu.  However, the following vmstat operation
> will then bring the counter again under the threshold limit.
> 
> The operations for __xxx_zone_state are not changed since the caller
> has taken care of the synchronization needs (and therefore the cycle
> count is even less than the optimized version for the irq disable case
> provided here).
> 
> The optimization using this_cpu_cmpxchg will only be used if the arch
> supports efficient this_cpu_ops (must have CONFIG_CMPXCHG_LOCAL set!)
> 
> The use of this_cpu_cmpxchg reduces the cycle count for the counter
> operations by 80% (inc_zone_page_state goes from 170 cycles to 32).
> 
> Signed-off-by: Christoph Lameter <cl@linux.com>
>
+/*
+ * If we have cmpxchg_local support then we do not need to incur the overhead
+ * that comes with local_irq_save/restore if we use this_cpu_cmpxchg.
+ *
+ * mod_state() modifies the zone counter state through atomic per cpu
+ * operations.
+ *
+ * Overstep mode specifies how overstep should be handled:
+ *     0       No overstepping
+ *     1       Overstepping half of threshold
+ *     -1      Overstepping minus half of threshold
+*/
+static inline void mod_state(struct zone *zone,
+       enum zone_stat_item item, int delta, int overstep_mode)
+{
+	struct per_cpu_pageset __percpu *pcp = zone->pageset;
+	s8 __percpu *p = pcp->vm_stat_diff + item;
+	long o, n, t, z;
+
+	do {
+		z = 0;  /* overflow to zone counters */
+
+		/*
+		 * The fetching of the stat_threshold is racy. We may apply
+		 * a counter threshold to the wrong cpu if we get
+		 * rescheduled while executing here. However, the following
+		 * will apply the threshold again and therefore bring the
+		 * counter under the threshold.
+		 */

What does "the following" mean here?  Later executions of the
function?  It seems like the counter can go out of the threshold at
least temporarily, which probably is okay but I think the comment can
be improved a bit.
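
To make the question concrete, here is a minimal userspace sketch of the
kind of retry loop under discussion. It is not the patch's code: the quoted
hunk stops before the loop body, so the folding rule below is an assumption
modeled on the overstep comment, and struct counter and mod_state_sketch()
are hypothetical names. C11 atomics stand in for this_cpu_cmpxchg().

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-ins; none of these names come from the patch. */
struct counter {
	_Atomic signed char diff;  /* small local delta, like vm_stat_diff[] */
	atomic_long global;        /* shared total, like the zone counter */
	signed char threshold;     /* like stat_threshold */
};

/*
 * Add delta to the local diff with a compare-and-exchange retry loop.
 * When the new value leaves [-threshold, threshold], fold it into the
 * global counter (assumed rule). overstep_mode is 0, 1 or -1.
 */
static void mod_state_sketch(struct counter *c, int delta, int overstep_mode)
{
	signed char o;
	long n, t, z;

	assert(overstep_mode >= -1 && overstep_mode <= 1);
	assert(c->threshold >= 0);

	o = atomic_load(&c->diff);
	do {
		z = 0;             /* amount to fold into the global counter */
		t = c->threshold;  /* may be stale or "wrong"; see note below */
		n = (long)o + delta;

		if (n > t || n < -t) {
			long os = overstep_mode * (t >> 1);

			z = n + os;  /* overflow goes to the global counter */
			n = -os;     /* local diff restarts at the overstep */
		}
		/* On failure, o is refreshed with the current value; retry. */
	} while (!atomic_compare_exchange_weak(&c->diff, &o, (signed char)n));

	if (z)
		atomic_fetch_add(&c->global, z);
}
```

With threshold 4 and overstep mode 1, five increments of 1 leave diff at -2
and global at 7, so the sum is still 5. If one call happens to use a larger
threshold (say 8), diff can sit above 4 for a while; the next call that sees
threshold 4 folds it into global. That is one possible reading of "the
following" in the comment.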

Thanks.

-- 
tejun
