public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Mathieu Desnoyers <compudj@krystal.dyndns.org>
To: Christoph Lameter <cl@linux.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Yuriy Lalym <ylalym@gmail.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	ltt-dev@lists.casi.polymtl.ca, Tejun Heo <tj@kernel.org>,
	Ingo Molnar <mingo@elte.hu>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [ltt-dev] [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
Date: Sat, 2 May 2009 03:01:06 -0400	[thread overview]
Message-ID: <20090502070105.GA2772@Krystal> (raw)
In-Reply-To: <alpine.DEB.1.10.0905012257510.32364@qirst.com>

* Christoph Lameter (cl@linux.com) wrote:
> On Fri, 1 May 2009, Mathieu Desnoyers wrote:
> 
> > Then, if I understand you correctly, you propose :
> >
> > void __inc_zone_state(struct zone *zone, enum zone_stat_item item)
> > {
> >         ... assuming p references percpu "u8" counters ...
> >         u8 p_new;
> >
> >         p_new = percpu_add_return(p, 1);
> >
> >         if (unlikely(!(p_new & pcp->stat_threshold_mask)))
> >                 zone_page_state_add(pcp->stat_threshold, zone, item);
> > }
> >
> > void inc_zone_state(struct zone *zone, enum zone_stat_item item)
> > {
> >         ... assuming p references percpu "u8" counters ...
> >         u8 p_new;
> >
> >         p_new = percpu_add_return_irqsafe(p, 1);
> >
> >         if (unlikely(!(p_new & pcp->stat_threshold_mask)))
> >                 zone_page_state_add(pcp->stat_threshold, zone, item);
> > }
> >
> > (therefore opting for code duplication)
> >
> > Am I correct ?
> 
> Well __inc_zone_state is fine by itself. inc_zone_state will currently
> disable irqs. But we can do it your way and duplicate the code.
> 

OK, I think I see the source of disagreement here. Leaving
local_irq_save in place in inc_zone_state as you propose will degrade
performance significantly. x86 is not the only architecture not that
fast at disabling interrupts.

Let met get my hands on the numbers I've prepared for my upcoming paper
about tracer locking... here they are. These are in cycles.

Architecture           Speedup             CAS        Interrupts
                (cli+sti) / local CAS  local sync     sti   cli

AMD Athlon(tm)64 X2         4.57       7     17       17    15
Intel Core2                 6.33       6     30       20    18
Intel Xeon E5405            5.25       8     20       20    22
PowerPC G5                  4.00       1      2        3     1
PowerPC POWER6 4.2 GHz      1.77       9     17       14     2
Itanium 2 (single and dual-core, 1.6GHz)
                            1.33       3      3        2     2
UltraSPARC-IIIi (1.2GHz)(a) 0.64     0.394 0.394    0.094  0.159
        (a)  (in system bus clock cycles)

I've also got numbers for atomic add return.. it's usually a bit faster
than local CAS, except on architectures where atomic add return must be
emulated by a CAS (it's then very slightly slower). But in any case,
disabling/enabling interrupts is _WAY_ slower than local CAS or local
add return.

> > void __inc_zone_state(struct zone *zone, enum zone_stat_item item)
> > {
> >         ... assuming p references percpu "u8" counters ...
> >         u8 p_new;
> >
> >         p_new = __percpu_add_return_irqsafe(p, 1);
> >
> >         if (unlikely(!(p_new & pcp->stat_threshold_mask)))
> >                 zone_page_state_add(pcp->stat_threshold, zone, item);
> > }
> >
> > void inc_zone_state(struct zone *zone, enum zone_stat_item item)
> > {
> > 	unsigned long flags;
> >
> > 	percpu_local_irqsave(flags);
> > 	__inc_zone_state(zone, item);
> > 	percpu_local_irqrestore(flags);
> > }
> >
> > Which is more compact and does not duplicate code.
> 
> This is almost like the current code. But lets avoid percpu_local_irqs
> etc if we can.

If we want to get good performance out of those percpu ops, we have to
leave interrupts enabled. It's a factor ~5 slowdown otherwise on x86. So
the choice really comes down to : either we duplicate code, or we create
those percpu_local_irqsave-like primitives. BTW if the name is bad, we
may have to just find something better.

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

  reply	other threads:[~2009-05-02  7:01 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-04-29 23:25 [PATCH] Fix dirty page accounting in redirty_page_for_writepage() Mathieu Desnoyers
2009-04-29 23:56 ` Mathieu Desnoyers
2009-04-29 23:59 ` Andrew Morton
2009-04-30  2:34   ` Mathieu Desnoyers
2009-04-30  0:06 ` Linus Torvalds
2009-04-30  2:43   ` Mathieu Desnoyers
2009-04-30  6:21     ` Ingo Molnar
2009-04-30  6:33       ` [ltt-dev] " Mathieu Desnoyers
2009-04-30  6:50         ` Ingo Molnar
2009-04-30 13:38           ` Christoph Lameter
2009-04-30 14:10             ` Ingo Molnar
2009-04-30 14:12             ` Mathieu Desnoyers
2009-04-30 14:12               ` Christoph Lameter
2009-04-30 19:41                 ` Mathieu Desnoyers
2009-04-30 20:17                   ` Christoph Lameter
2009-04-30 21:17                     ` Mathieu Desnoyers
2009-05-01 13:44                       ` Christoph Lameter
2009-05-01 19:21                         ` Mathieu Desnoyers
2009-05-01 19:31                           ` Christoph Lameter
2009-05-01 20:24                             ` Mathieu Desnoyers
2009-05-01 20:28                               ` Christoph Lameter
2009-05-01 20:43                                 ` Mathieu Desnoyers
2009-05-01 20:42                                   ` Christoph Lameter
2009-05-01 21:19                                     ` Mathieu Desnoyers
2009-05-02  3:00                                       ` Christoph Lameter
2009-05-02  7:01                                         ` Mathieu Desnoyers [this message]
2009-05-02 21:01                             ` Mathieu Desnoyers
2009-05-04 14:08                               ` Christoph Lameter
2009-05-03  2:40       ` Tejun Heo
2009-05-04 14:10         ` Christoph Lameter
2009-04-30 13:22     ` Christoph Lameter
2009-04-30 13:38       ` Ingo Molnar
2009-04-30 13:40         ` Christoph Lameter
2009-04-30 14:14           ` Ingo Molnar
2009-04-30 14:15             ` Christoph Lameter
2009-04-30 14:38               ` Ingo Molnar
2009-04-30 14:45                 ` Christoph Lameter
2009-04-30 15:01                   ` Ingo Molnar
2009-04-30 15:25                     ` Christoph Lameter
2009-04-30 15:42                       ` Ingo Molnar
2009-04-30 15:44                         ` Christoph Lameter
2009-04-30 16:06                           ` Ingo Molnar
2009-04-30 16:11                             ` Christoph Lameter
2009-04-30 16:16                             ` Linus Torvalds
2009-04-30 17:23                               ` Ingo Molnar
2009-04-30 18:07                                 ` Christoph Lameter
2009-05-01 19:59                                   ` Ingo Molnar
2009-05-01 20:35                                     ` Christoph Lameter
2009-05-01 21:07                                       ` Ingo Molnar
2009-05-02  3:06                                         ` Christoph Lameter
2009-05-02  9:03                                           ` Ingo Molnar
2009-05-04 14:48                                             ` Christoph Lameter
2009-04-30 16:13                         ` Linus Torvalds
2009-04-30 15:54                       ` Ingo Molnar
2009-04-30 16:00                       ` Ingo Molnar
2009-04-30 16:08                         ` Christoph Lameter
2009-04-30 13:50         ` Mathieu Desnoyers
2009-04-30 13:55           ` Christoph Lameter
2009-04-30 14:32           ` Ingo Molnar
2009-04-30 14:42             ` Christoph Lameter
2009-04-30 14:59               ` Ingo Molnar
2009-04-30 16:03             ` [ltt-dev] " Mathieu Desnoyers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090502070105.GA2772@Krystal \
    --to=compudj@krystal.dyndns.org \
    --cc=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=cl@linux.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ltt-dev@lists.casi.polymtl.ca \
    --cc=mingo@elte.hu \
    --cc=nickpiggin@yahoo.com.au \
    --cc=tj@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=ylalym@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox