What's the right way to use a *large* number of source addresses?

netdev.vger.kernel.org archive mirror

From: nisse@southpole.se (Niels Möller)
To: netdev@vger.kernel.org
Cc: Jonas Bonn <jonas@southpole.se>
Subject: What's the right way to use a *large* number of source addresses?
Date: Fri, 23 May 2014 11:38:22 +0200
Message-ID: <6zlhtsvnqp.fsf@southpole.se>

Hi,

I have a client doing traffic generation for load testing. When they
experienced performance problems when assigning a large number (say,
100000) of IP addresses to an interface, they wrote a custom,
proprietary source NAT kernel module which lets you set any desired
source address on a socket, and then sets up SNAT to that address. This
was a couple of years ago, and it appears to have worked fine.

However, the code is a bit complex, and duplicates functionality in the
iptables SNAT target and the connection tracking machinery in the
current kernel. Even if I could relicense the module under a free
license, I suspect it would be shot down for technical reasons.

So now I'm trying to figure out what's the Right Way to enable traffic
generation with a large number of source addresses, to possibly retire
the proprietary kernel module. I see a couple of different approaches:

1. Simply assign all addresses to be used to the interface, fixing any
   remaining performance problems.

   I've done a simple benchmark with a script assigning n addresses
   using "ip address add", and this seems to have O(n^2) complexity.
   E.g., assigning n=25500 addresses took 26 s, and doubling n, assigning
   51000 addresses, took 122 s, about 4.7 times as long. That isn't
   necessarily a problem once all the addresses are assigned, but it
   sounds a bit like there's a linear data structure in there, not
   intended for a large number of addresses.

   A way to add an address range (or a prefix), using a *single* entry
   of whatever data structures are used, would help.

2. Do source NAT. I think the current SNAT target does almost everything
   needed. It could be extended with some setsockopt to set the desired
   address on a per-socket basis. I'm not sure where to store that info:
   either associate the desired address with the socket and have the
   SNAT module look for it, or maybe have setsockopt create a conntrack
   entry in advance, prior to connect.

   The main drawback of using NAT is the overhead of connection tracking;
   it would be preferable if the only per-connection state needed is the
   socket itself.

3. Just set the desired local address with the bind call. Currently,
   this gives an EADDRNOTAVAIL error, so the first step would be some
   option to allow arbitrary source addresses.

   For an arbitrary source address, the network stack can't guess the
   intended interface. So one would also need to support something like
   SO_BINDTODEVICE to tell it explicitly.

   And for replies to be passed up to the transport layer, one must set
   up some processing of incoming packets to deliver them to the local
   machine. It's very unclear to me if there's any good way to do that;
   maybe one needs a conntrack entry for each connection just like with
   SNAT. Even with conntrack I think this approach is a bit cleaner than
   SNAT, in that the transport-layer 5-tuple would be based on the
   address that really is used on the wire.
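
As a rough illustration of the benchmark in (1), here is a minimal Python
sketch. It estimates the growth exponent from the two timings quoted above
and generates input lines for "ip -batch". The helper names, the starting
address 10.0.0.1, the /32 prefix length and the device name eth0 are all
assumptions made for the example. Batch mode only saves the cost of
starting one process per address; whether it changes the kernel-side
behaviour is not something this sketch measures.

```python
import ipaddress
import math


def growth_exponent(n1, t1, n2, t2):
    """Estimate k in t ~ n**k from two (n, seconds) measurements."""
    return math.log(t2 / t1) / math.log(n2 / n1)


def batch_lines(first_addr, count, dev):
    """Yield one 'address add' line per address, as input for `ip -batch`.

    first_addr, the /32 prefix length and dev are illustrative assumptions.
    """
    start = ipaddress.ip_address(first_addr)
    for i in range(count):
        yield f"address add {start + i}/32 dev {dev}"


if __name__ == "__main__":
    # Timings from the message: 25500 addresses in 26 s, 51000 in 122 s.
    k = growth_exponent(25500, 26.0, 51000, 122.0)
    print(f"estimated exponent: {k:.2f}")

    # Print a few batch lines; redirect to a file and feed it to `ip -batch`.
    for line in batch_lines("10.0.0.1", 3, "eth0"):
        print(line)
```

With the two data points above the estimated exponent is a little over 2,
which is in line with the suspicion of quadratic behaviour. Two
measurements are of course too few to say more than that.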

What do you think? From a user perspective, I think I'd prefer either
(1), or (3) with a single setsockopt call which means "I'm going to use
an arbitrary source address. Transmit my packets over interface X, and
arrange the processing of incoming packets so that replies arrive at
this socket.", and then specify the desired source address with bind()
as usual.
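
As a point of comparison for the bind() step in (3), the sketch below
uses the existing Linux socket option IP_FREEBIND (documented in ip(7)),
which lets bind() succeed for an address that is not configured on the
host. It covers the bind step only. It does not select an outgoing
interface and does not arrange delivery of replies, which are the open
questions above. The helper name try_bind and the address 192.0.2.1
(from the TEST-NET-1 documentation range) are assumptions made for the
example.

```python
import errno
import socket

# Fall back to the Linux numeric value if this Python build lacks the constant.
IP_FREEBIND = getattr(socket, "IP_FREEBIND", 15)


def try_bind(addr, freebind):
    """Try to bind a TCP socket to addr; return 'ok' or the errno name."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if freebind:
            # Ask the kernel to allow binding to a non-local address.
            s.setsockopt(socket.IPPROTO_IP, IP_FREEBIND, 1)
        s.bind((addr, 0))  # port 0: let the kernel pick a port
        return "ok"
    except OSError as e:
        return errno.errorcode.get(e.errno, str(e.errno))
    finally:
        s.close()


if __name__ == "__main__":
    # 192.0.2.1 is assumed not to be configured on this host.
    print("plain bind:   ", try_bind("192.0.2.1", freebind=False))
    print("with FREEBIND:", try_bind("192.0.2.1", freebind=True))
```

On a typical Linux host the plain bind reports EADDRNOTAVAIL, unless the
net.ipv4.ip_nonlocal_bind sysctl is enabled, while the second call is
expected to succeed.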

Best regards,
/Niels Möller
