From: Andrew Morton <akpm@linux-foundation.org>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>,
	Stephen Hemminger <shemminger@linux-foundation.org>,
	netdev@vger.kernel.org, bhutchings@solarflare.com,
	Nick Piggin <npiggin@suse.de>
Subject: Re: [PATCH net-next-2.6] bridge: 64bit rx/tx counters
Date: Thu, 12 Aug 2010 15:11:45 -0700
Message-ID: <20100812151145.f5fa259b.akpm@linux-foundation.org>
In-Reply-To: <1281649657.2305.38.camel@edumazet-laptop>

On Thu, 12 Aug 2010 23:47:37 +0200
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> On Thursday, 12 August 2010 at 08:07 -0700, Andrew Morton wrote:
> > On Thu, 12 Aug 2010 14:16:15 +0200 Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > 
> > > > And all this open-coded per-cpu counter stuff added all over the place.
> > > > Were percpu_counters tested or reviewed and found inadequate and unfixable?
> > > > If so, please do tell.
> > > > 
> > > 
> > > percpu_counters tries hard to maintain a view of the current value of
> > > the (global) counter. This adds a cost because of a shared cache line
> > > and locking. (__percpu_counter_sum() is not very scalable on big hosts,
> > > it locks the percpu_counter lock for a possibly long iteration)
> > 
> > Could be.  Is percpu_counter_read_positive() unsuitable?
> > 
> 
> I bet most people want precise counters when doing 'ifconfig lo'
> 
> SNMP applications would be very surprised to get non increasing values
> between two samples, or inexact values.

percpu_counter_read_positive() should be returning monotonically
increasing numbers - if it ever went backward that would be bad.  But
yes, the value will increase in a lumpy fashion.  Probably one would
need to make informed choices between percpu_counter_read_positive()
and percpu_counter_sum(), depending on the type of stat.
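
To make the trade-off above concrete, here is a minimal single-threaded
userspace model of a batched per-CPU counter. It is only a sketch: it is
not the kernel's percpu_counter code, it has no locking or real per-CPU
storage, and the names model_counter, model_add, model_read_positive,
model_sum, NR_CPUS and BATCH are hypothetical. It shows why the cheap
read advances in lumps of the batch size while the summing read is exact
but has to walk every CPU.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 4   /* hypothetical CPU count for this model */
#define BATCH   32  /* hypothetical batch size */

struct model_counter {
	int64_t count;           /* shared, approximate global value */
	int32_t local[NR_CPUS];  /* per-CPU deltas not yet folded in */
};

/* Fast path: only the local delta is touched until it reaches BATCH. */
static void model_add(struct model_counter *c, int cpu, int32_t amount)
{
	int32_t v;

	assert(cpu >= 0 && cpu < NR_CPUS);
	v = c->local[cpu] + amount;
	if (v >= BATCH || v <= -BATCH) {
		c->count += v;       /* a real implementation would lock here */
		c->local[cpu] = 0;
	} else {
		c->local[cpu] = v;
	}
}

/* Cheap read: looks at the shared part only, clamped at zero. */
static int64_t model_read_positive(const struct model_counter *c)
{
	return c->count > 0 ? c->count : 0;
}

/* Exact read: walks every CPU's pending delta. */
static int64_t model_sum(const struct model_counter *c)
{
	int64_t sum = c->count;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += c->local[cpu];
	return sum;
}
```

In this model, after 31 single-unit adds on one CPU the cheap read still
reports 0 while the sum reports 31; the 32nd add folds the whole batch
into the shared value at once.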

But that's all a bit academic.

>
> > > And this folding has zero effect on
> > > concurrent writers (counter updates)
> > 
> > The fastpath looks a little expensive in the code you've added.  The
> > write_seqlock() does an rmw and a wmb() and the stats inc is a 64-bit
> > rmw whereas percpu_counters do a simple 32-bit add.  So I'd expect that
> > at some suitable batch value, percpu-counters are faster on 32-bit. 
> > 
> 
> Hmm... 6 instructions (16 bytes of text) are a "little expensive" versus
> 120 instructions if we use percpu_counter ?
> 
> Following code from drivers/net/loopback.c
> 
> 	u64_stats_update_begin(&lb_stats->syncp);
> 	lb_stats->bytes += len;
> 	lb_stats->packets++;
> 	u64_stats_update_end(&lb_stats->syncp);
> 
> maps on i386 to :
> 
> 	ff 46 10             	incl   0x10(%esi)  // u64_stats_update_begin(&lb_stats->syncp);
> 	89 f8                	mov    %edi,%eax
> 	99                   	cltd   
> 	01 7e 08             	add    %edi,0x8(%esi)
> 	11 56 0c             	adc    %edx,0xc(%esi)
> 	83 06 01             	addl   $0x1,(%esi)
> 	83 56 04 00          	adcl   $0x0,0x4(%esi)
> 	ff 46 10             	incl   0x10(%esi) // u64_stats_update_end(&lb_stats->syncp);
> 
> 
> Exactly 6 added instructions compared to previous kernel (32bit
> counters), only on 32bit hosts. These instructions are not expensive (no
> conditional branches, no extra register pressure) and access private cpu
> data.
> 
> While two calls to __percpu_counter_add() add about 120 instructions,
> even on 64bit hosts, wasting precious cpu cycles.

Oy.  You omitted the per_cpu_ptr() evaluation and, I bet, counted all
the instructions which are executed only once per batch.

> 
> > They'll usually be slower on 64-bit, until that num_possible_cpus walk
> > bites you.
> > 
> 
> But are you aware we already fold SNMP values using for_each_possible()
> macros, before adding 64bit counters ? Not related to 64bit stuff
> really...


> > percpu_counters might need some work to make them irq-friendly.  That
> > bare spin_lock().
> > 
> > btw, I worry a bit about seqlocks in the presence of interrupts:
> > 
> 
> Please note that nothing is assumed about interrupts and seqcounts
> 
> Both readers and writers must mask them if necessary.
> 
> In most situations, masking softirq is enough for networking cases
> (updates are performed from softirq handler, reads from process context)

Yup, write_seqcount_begin/end() are pretty dangerous-looking.  The
caller needs to protect the lock against other CPUs, against interrupts
and even against preemption.
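
As an illustration of the sequence-counter pattern under discussion, here
is a minimal single-threaded userspace model. It is a sketch under stated
assumptions, not the kernel's seqcount or u64_stats_sync code: there are
no memory barriers and no real concurrency, and the names model_stats,
model_update, model_fetch_begin, model_fetch_retry and model_read are
hypothetical. The assert in the writer marks the hazard described above:
if a second writer ran in the middle of an update (from an interrupt, or
after preemption), the sequence would already be odd on entry.

```c
#include <assert.h>
#include <stdint.h>

struct model_stats {
	unsigned int seq;   /* even: stable, odd: update in progress */
	uint64_t bytes;
	uint64_t packets;
};

/* Writer: the caller must guarantee no other writer can interleave. */
static void model_update(struct model_stats *s, uint64_t len)
{
	assert((s->seq & 1) == 0);  /* trips if a writer is already active */
	s->seq++;                   /* begin: sequence becomes odd */
	s->bytes += len;
	s->packets++;
	s->seq++;                   /* end: sequence becomes even again */
}

static unsigned int model_fetch_begin(const struct model_stats *s)
{
	return s->seq;
}

/* Nonzero if the snapshot taken since 'start' may be torn. */
static int model_fetch_retry(const struct model_stats *s, unsigned int start)
{
	return (start & 1) || s->seq != start;
}

/* Reader: copy both fields, retry if a writer was seen in between. */
static void model_read(const struct model_stats *s,
		       uint64_t *bytes, uint64_t *packets)
{
	unsigned int start;

	do {
		start = model_fetch_begin(s);
		*bytes = s->bytes;
		*packets = s->packets;
	} while (model_fetch_retry(s, start));
}
```

Because the model is single-threaded, model_read() must only be called
while no update is in progress; with a real concurrent writer the retry
loop is what lets the reader wait out an odd sequence.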


