netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Benjamin LaHaise <bcrl@lhnet.ca>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: netdev@vger.kernel.org
Subject: Re: [PATCH/RFC] make unregister_netdev() delete more than 4 interfaces per second
Date: Sun, 18 Oct 2009 14:21:44 -0400	[thread overview]
Message-ID: <20091018182144.GC23395@kvack.org> (raw)
In-Reply-To: <4ADB55BC.5020107@gmail.com>

On Sun, Oct 18, 2009 at 07:51:56PM +0200, Eric Dumazet wrote:
> You forgot af_packet sendmsg() users, and heavy routers where route cache
> is stressed or disabled. I know several of them, they even added mmap TX 
> support to get better performance. They will be disapointed by your patch.

If that's a problem, the cacheline overhead is a more serious issue.  
AF_PACKET should really keep the reference on the device between syscalls.  
Do you have a benchmark in mind that would show the overhead?

> atomic_dec_and_test() is definitly more expensive, because of strong barrier
> semantics and added test after the decrement.
> refcnt being close to zero or not has not impact, even on 2 years old cpus.

At least on x86, the atomic_dec_and_test() cost is pretty much identical to 
atomic_dec().  If this really is a performance bottleneck, people should be 
complaining about the cache miss overhead and lock overhead which will dwarf 
the atomic_dec_and_test() cost vs atomic_dec().  Granted, I'm not saying 
that it isn't an issue on other architectures, but for x86 the lock prefix 
is what's expensive, not checking the flags or not after doing the operation.

If your complaint is about uninlining dev_put(), I'm indifferent to keeping 
it inline or out of line and can change the patch to suit.

> Machines hardly had to dismantle a netdevice in a normal lifetime, so maybe
> we were lazy with this insane msleep(250). This came from old linux times,
> when cpus were soooo slow and programers soooo lazy :)

It's only now that machines can actually route one or more 10Gbps links 
that it really becomes an issue.  I've been hacking around it for some 
time, but fixing it properly is starting to be a real requirement.

> The msleep(250) should be tuned first. Then if this is really necessary
> to dismantle 100.000 netdevices per second, we might have to think a bit more.
> 
> Just try msleep(1 or 2), it should work quite well.

My goal is tearing down 100,000 interfaces in a few seconds, which really is 
necessary.  Right now we're running about 40,000 interfaces on a not yet 
saturated 10Gbps link.  Going to dual 10Gbps links means pushing more than 
100,000 subscriber interfaces, and it looks like a modern dual socket system 
can handle that.

A bigger concern is rtnl_lock().  It is a huge impediment to scaling up 
interface creation/deletion on multicore systems.  That's going to be a 
lot more invasive to fix, though.

		-ben

  reply	other threads:[~2009-10-18 18:21 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-10-17 22:18 [PATCH/RFC] make unregister_netdev() delete more than 4 interfaces per second Benjamin LaHaise
2009-10-18  4:26 ` Eric Dumazet
2009-10-18 16:13   ` Benjamin LaHaise
2009-10-18 17:51     ` Eric Dumazet
2009-10-18 18:21       ` Benjamin LaHaise [this message]
2009-10-18 19:36         ` Eric Dumazet
2009-10-21 12:39         ` Octavian Purdila
2009-10-21 15:40           ` [PATCH] net: allow netdev_wait_allrefs() to run faster Eric Dumazet
2009-10-21 16:09             ` Eric Dumazet
2009-10-21 16:51             ` Benjamin LaHaise
2009-10-21 19:54               ` Eric Dumazet
2009-10-29 23:07               ` Eric W. Biederman
2009-10-29 23:38                 ` Benjamin LaHaise
2009-10-30  1:45                   ` Eric W. Biederman
2009-10-30 14:35                     ` Benjamin LaHaise
2009-10-30 14:43                       ` Eric Dumazet
2009-10-30 23:25                       ` Eric W. Biederman
2009-10-30 23:53                         ` Benjamin LaHaise
2009-10-31  0:37                           ` Eric W. Biederman
2010-08-09 17:23                   ` Ben Greear
2010-08-09 17:34                     ` Benjamin LaHaise
2010-08-09 17:44                       ` Ben Greear
2010-08-09 17:48                         ` Benjamin LaHaise
2010-08-09 18:03                           ` Ben Greear
2010-08-09 19:59                       ` Eric W. Biederman
2010-08-09 21:03                         ` Benjamin LaHaise
2010-08-09 21:17                           ` Eric W. Biederman
2009-10-21 16:55             ` Octavian Purdila
2009-10-23 21:13             ` Paul E. McKenney
2009-10-24  4:35               ` Eric Dumazet
2009-10-24  5:49                 ` Paul E. McKenney
2009-10-24  8:49                   ` Eric Dumazet
2009-10-24 13:52                     ` Paul E. McKenney
2009-10-24 14:24                       ` Eric Dumazet
2009-10-24 14:46                         ` Paul E. McKenney
2009-10-24 23:49                         ` Octavian Purdila
2009-10-25  4:47                           ` Paul E. McKenney
2009-10-25  8:35                           ` Eric Dumazet
2009-10-25 15:19                             ` Octavian Purdila
2009-10-25 19:28                               ` Eric Dumazet
2009-10-24 20:22                 ` Stephen Hemminger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20091018182144.GC23395@kvack.org \
    --to=bcrl@lhnet.ca \
    --cc=eric.dumazet@gmail.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).