Using iptables with high volume mail


* Using iptables with high volume mail
@ 2009-10-01 11:42 John Little
  2009-10-01 11:54 ` Richard Horton
  2009-10-01 16:03 ` Thomas Jacob
  0 siblings, 2 replies; 11+ messages in thread
From: John Little @ 2009-10-01 11:42 UTC (permalink / raw)
  To: netfilter

Hi all,

I work for a major email service provider.  Our
management has asked us to investigate using iptables as "NAT engine"
for outbound mail.

The outbound mail is the only traffic the
server will see.  No inbound mail, web, etc.  The machine(s) will have
a public facing NIC and a NIC for the internal LAN.

The machines will see over 1 million emails in a 24 hour period.

My questions are:
Can iptables handle this volume?

What modules, tables and rules should we use to optimize iptables for this type of volume?  All of the mail is sent on the standard port 25.  We need to optimize for quick deliverability.  (I've read the man page and looked at TOS with the mangle table.  I read somewhere that this is only for UDP.)

Is there a way to estimate how much hardware we would need for a given volume of mail?

Are there any use cases that I can show management?

Is there commercial support available?

We really want to sell this to management.  We have gone through 2 major brands of commercial devices for NATting that aren't making the grade for what we are paying.  Any ideas and insights appreciated.

Thanks,
John


* Re: Using iptables with high volume mail
  2009-10-01 11:42 Using iptables with high volume mail John Little
@ 2009-10-01 11:54 ` Richard Horton
  2009-10-01 12:45   ` John Little
  2009-10-01 16:03 ` Thomas Jacob
  1 sibling, 1 reply; 11+ messages in thread
From: Richard Horton @ 2009-10-01 11:54 UTC (permalink / raw)
  To: John Little; +Cc: netfilter

2009/10/1 John Little <jlittle_97@yahoo.com>:

> What modules, tables and rules should we use to optimize iptables for this type of volume?  All of the mail is sent on the standard port 25.  We need to optimize for quick deliverability.  (I've read the man page and looked at TOS with the mangle table.  I read somewhere that this is only for UDP.)

Setting the DSCP / ToS field via mangle will work with IP traffic
regardless of payload type (UDP/TCP/IPSEC Tunnelled/etc). However,
there is only any point in applying it for 'quick' delivery if the
upstream routers are configured to apply a diffserv policy on a per
hop basis.

Apart from that, 'quick delivery' isn't really something diffserv can
give you: EF traffic (Expedited Forwarding) is intended for real-time,
jitter-sensitive traffic where loss is less of an issue than excessive
inter-packet delay. For reliable delivery use an AFxx class. However, I
don't believe applying diffserv / ToS in your case will achieve the
end results you are looking for unless you have control over all the
hops along the mail path, or SLAs in place with the network
provider(s). Usually, once you exceed your purchased amount of
traffic within a class, it's either re-marked or dropped, and strictly
under diffserv it should be dropped, as you should not re-mark outside of a
class.
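
To make the code points above concrete, here is a minimal sketch (an illustration only, not a recommendation to mark mail traffic). It relies on two facts from the DiffServ RFCs: the DSCP value sits in the upper six bits of the old ToS byte, and an AFxy class has DSCP value 8x + 2y. The helper names af_dscp and dscp_to_tos are made up for this example.

```python
# DSCP occupies the upper 6 bits of the former IPv4 ToS byte (RFC 2474);
# the lower 2 bits are used for ECN.

EF = 46  # Expedited Forwarding code point (RFC 3246)


def af_dscp(af_class: int, drop_precedence: int) -> int:
    """Return the DSCP value for AF<class><drop>, e.g. AF11 -> 10 (RFC 2597)."""
    if not 1 <= af_class <= 4 or not 1 <= drop_precedence <= 3:
        raise ValueError("AF classes are 1-4, drop precedences are 1-3")
    return 8 * af_class + 2 * drop_precedence


def dscp_to_tos(dscp: int) -> int:
    """Shift a 6-bit DSCP value into its position in the ToS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a 6-bit value")
    return dscp << 2


if __name__ == "__main__":
    for name, dscp in [("EF", EF), ("AF11", af_dscp(1, 1)), ("AF41", af_dscp(4, 1))]:
        print(f"{name}: DSCP {dscp}, ToS byte 0x{dscp_to_tos(dscp):02x}")
```

Whatever value is chosen only has an effect if the routers along the path honor it, as noted above.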

-- 
Richard Horton
Users are like a virus: Each causing a thousand tiny crises until the
host finally dies.
http://www.solstans.co.uk - Solstans Japanese Bobtails and Norwegian Forest Cats
http://www.pbase.com/arimus - My online photogallery


* Re: Using iptables with high volume mail
  2009-10-01 11:54 ` Richard Horton
@ 2009-10-01 12:45   ` John Little
  0 siblings, 0 replies; 11+ messages in thread
From: John Little @ 2009-10-01 12:45 UTC (permalink / raw)
  To: Richard Horton; +Cc: netfilter





Hi Richard

Good point.  We don't control the hops on the mail path.  We also
strictly observe the traffic rules that we have agreed to with the
upstream providers.  

As I think about my question and your answers the next part would be
that we want to "streamline" our iptables rules so that they are
working efficiently and not consuming any more resources than are
necessary.  To that end I would think that I would probably need to have some rules written and post them here for review.  

Resource consumption has been a major issue with the commercial devices
we have tried.  This has led to the question of building the machines
with iptables that are tuned specifically for our environment.  

I realize that other kernel tuning parameters need to be factored in as
well.  I just want to make sure we have all of our bases covered.

Thanks,
John



      


* Re: Using iptables with high volume mail
  2009-10-01 11:42 Using iptables with high volume mail John Little
  2009-10-01 11:54 ` Richard Horton
@ 2009-10-01 16:03 ` Thomas Jacob
  2009-10-01 16:40   ` Gáspár Lajos
  1 sibling, 1 reply; 11+ messages in thread
From: Thomas Jacob @ 2009-10-01 16:03 UTC (permalink / raw)
  To: John Little; +Cc: netfilter


> What modules, tables and rules should we use to optimize iptables for this type of volume?  All of the mail is sent on the standard port 25.  We need to optimize for quick deliverability.  (I've read the man page and looked at TOS with the mangle table.  I read somewhere that this is only for UDP.)
> 
> Is there a way to estimate how much hardware we would need for a given volume of mail?

This all really depends on the number of new connections and packets per
unit of time, rather than the number of emails.

Assuming that you'll be sending the 1 million emails per day on one
machine, and that you only need one connection per email, we are
talking about 11-12 new connections per second (cps) and maybe 20 times
as many packets on average (or possibly higher, you should measure that).
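
The arithmetic behind that estimate can be sketched as follows. The one-connection-per-email and 20-packets-per-connection figures are the assumptions stated above, not measurements, and the function name is made up for this example.

```python
from typing import Tuple

SECONDS_PER_DAY = 24 * 60 * 60


def average_rates(emails_per_day: int,
                  conns_per_email: float = 1.0,
                  packets_per_conn: float = 20.0) -> Tuple[float, float]:
    """Return (new connections per second, packets per second), averaged over a day."""
    cps = emails_per_day * conns_per_email / SECONDS_PER_DAY
    pps = cps * packets_per_conn
    return cps, pps


if __name__ == "__main__":
    cps, pps = average_rates(1_000_000)
    print(f"~{cps:.1f} new connections/s, ~{pps:.0f} packets/s on average")
```

These are daily averages; peaks can be far higher, so measured peak rates are the better sizing input.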

If you'd just be doing connection tracking, that would not
even heat the CPUs of your standard off-the-shelf dual-core server
with, for instance, 2 good e1000e NICs, very much, let
alone lead to bottlenecks in the near future (2 cores only because
each NIC interrupt usually can only be bound to one core).

We've been running 80,000+ pps / 8,000+ cps on such machines
without any problems. Iptables beats all other free software
firewalls by orders of magnitude in terms of raw
forwarding speed (there was a test in a German IT mag a
couple of years ago that established this).

Now, whether or not NATing changes these relationships much
I do not know, but I doubt it.

    Thomas


* Re: Using iptables with high volume mail
  2009-10-01 16:03 ` Thomas Jacob
@ 2009-10-01 16:40   ` Gáspár Lajos
  2009-10-01 19:39     ` John Little
  0 siblings, 1 reply; 11+ messages in thread
From: Gáspár Lajos @ 2009-10-01 16:40 UTC (permalink / raw)
  To: Thomas Jacob; +Cc: John Little, netfilter

Thomas Jacob írta:
> This all really depends on the number of new connections and packets per
> time, rather than the number of emails.
>   
Agree.
But I would also check the upstream bandwidth and concurrent connections 
per time or per destination.

Swifty



* Re: Using iptables with high volume mail
  2009-10-01 16:40   ` Gáspár Lajos
@ 2009-10-01 19:39     ` John Little
  2009-10-02 12:31       ` Thomas Jacob
  0 siblings, 1 reply; 11+ messages in thread
From: John Little @ 2009-10-01 19:39 UTC (permalink / raw)
  To: Gáspár Lajos, Thomas Jacob; +Cc: netfilter


@swifty - Yes, we have agreements with upstream and manage our connections there with throttling so that we don't exceed our allotted connections in a given time period.

@thomas Thanks for those metrics.  We are checking whether our current devices report connections per second.  What we do know is that our max outbound connections will get as high as 16000 for a period of time (maybe 2-4 hours) and will occasionally burst up to around 20000.

How does that compare to the metrics that you mentioned earlier?

Thanks,
John  


* Re: Using iptables with high volume mail
  2009-10-01 19:39     ` John Little
@ 2009-10-02 12:31       ` Thomas Jacob
  2009-10-02 13:50         ` John Little
  0 siblings, 1 reply; 11+ messages in thread
From: Thomas Jacob @ 2009-10-02 12:31 UTC (permalink / raw)
  To: John Little; +Cc: Gáspár Lajos, netfilter


> @thomas Thanks for those metrics.  We are looking to see if the connections per second is
> generated with our current devices.  What we do know is that our max
> outbound connections will get as high as 16000 for a period of time
> (maybe 2-4 hours) and will occasionally burst up to around 20000.

I am guessing that means existing parallel connections, not new
connections per second (cps). The kind of server box I was referring
to can easily sustain millions of those, given enough
memory for the tables (the last number I remember was <300 bytes per
connection in the conntrack table + space for entries in the routing
cache for each different IP). Slabtop is your friend here.

What matters most is
what happens in each time slice, not so much how many connections
you have in the connection hash table (you can tune that table
with /proc/sys/net/ipv4/netfilter/ip_conntrack_max
and /sys/module/ip_conntrack/parameters/hashsize).

> How does that compare to the metrics that you mentioned earlier?

Well, any Switch/Router with SNMP support allows you to track bytes and
packets per second, so you could collect some data on the current
situation with that (www.cacti.net is a very nice tool).

As for new connections per second, once you have the iptables box
running you can get this info with lnstat -f ip_conntrack (the "new" column).

If you have a reasonably good switch/router in the datapath, you could
also use port mirroring to get a copy of the data stream and then
count all tcp/syn packets to port 25 to give you a rough idea
about the number of connections per second.
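
The SYN-counting idea can be sketched like this. The Packet record is a hypothetical stand-in for whatever your capture tool produces from the mirrored port; only the counting logic is the point. An initial SYN has the SYN flag set and the ACK flag clear; retransmitted SYNs are counted too, which is one reason the result is only a rough estimate.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Packet(NamedTuple):
    """Hypothetical record for one captured TCP packet."""
    timestamp: float  # seconds since the start of the capture
    dst_port: int
    syn: bool
    ack: bool


def new_connections_per_second(packets: Iterable[Packet], port: int = 25) -> Counter:
    """Count initial SYNs (SYN set, ACK clear) to `port`, bucketed by whole second."""
    per_second: Counter = Counter()
    for pkt in packets:
        if pkt.dst_port == port and pkt.syn and not pkt.ack:
            per_second[int(pkt.timestamp)] += 1
    return per_second


if __name__ == "__main__":
    sample = [
        Packet(0.10, 25, True, False),   # new connection
        Packet(0.15, 25, False, True),   # part of an established connection
        Packet(0.90, 25, True, False),   # new connection
        Packet(1.20, 25, True, False),   # new connection in the next second
        Packet(1.30, 80, True, False),   # not SMTP, ignored
    ]
    counts = new_connections_per_second(sample)
    print(dict(counts))
    print("peak cps:", max(counts.values()))
```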

However, emails per time should be pretty much the same as connections
per time, unless you open several tcp connections over the nat box
for each email, and I see no reason why you would need to do that ;)


* Re: Using iptables with high volume mail
  2009-10-02 12:31       ` Thomas Jacob
@ 2009-10-02 13:50         ` John Little
  2009-10-02 14:52           ` Thomas Jacob
  2009-10-02 15:08           ` Michele Petrazzo - Unipex
  0 siblings, 2 replies; 11+ messages in thread
From: John Little @ 2009-10-02 13:50 UTC (permalink / raw)
  To: Thomas Jacob; +Cc: Gáspár Lajos, netfilter





Ok thanks.

We have some stats now:

Packets per second:  avg 6221 max 41,810
 
Connections peak: avg 7263  max 22,981
 
New connections per second: avg 102 max 1029 

Given your numbers of 8000 cps and the above comments, it would seem that we are well below any overload threshold with any decent off-the-shelf server equipped with two dual-core CPUs and the necessary memory.  If I allocate 500 bytes per connection at the max connections, I would need ~87 Mbit (about 11 MB) + machine overhead.  That's not much in today's world of servers.
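
For reference, the memory estimate works out as below. This is a rough sketch using the 500 bytes per connection assumed above (a deliberately generous figure compared with the <300 bytes mentioned earlier in the thread); the function name is made up. Read as bits, the ~87 figure matches 22,981 connections at 500 bytes each; in bytes it is about 11 MB.

```python
def conntrack_memory_bytes(connections: int, bytes_per_entry: int = 500) -> int:
    """Estimate the memory needed for the given number of tracked connections."""
    return connections * bytes_per_entry


if __name__ == "__main__":
    peak_connections = 22_981  # measured maximum from the stats above
    total = conntrack_memory_bytes(peak_connections)
    print(f"{total} bytes")
    print(f"{total / 2**20:.1f} MiB")
    print(f"{total * 8 / 2**20:.1f} Mibit")
```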

Am I looking at this properly?

Thanks,
John



      


* Re: Using iptables with high volume mail
  2009-10-02 13:50         ` John Little
@ 2009-10-02 14:52           ` Thomas Jacob
  2009-10-02 15:08           ` Michele Petrazzo - Unipex
  1 sibling, 0 replies; 11+ messages in thread
From: Thomas Jacob @ 2009-10-02 14:52 UTC (permalink / raw)
  To: John Little; +Cc: Gáspár Lajos, netfilter


> Given your numbers of 8000 cps and the above comments it would seem
> that we are well
> within any types of overload issues with any decent off the shelf
> server equipped with two dual core CPUs and the necessary memory.
> If I allocate 500 bytes per connection at the max connections I would
> need ~87Mb + machine overhead.  That's not much in today's world of
> servers.

I would say so, unless someone on the list says NAT has completely
different performance requirements from the connection tracking only
machines. But I did do some tests to find the breaking points of
such machines some time ago (see below) and there should
be plenty of resources left for any additional NAT requirements,
given your numbers.

As for memory, we are using 4GB RAM on our high performance machines
(access throughput/latency is important here) with a 2GB kernel / 2GB userspace setup
in order to allow for huge firewall rulesets and to have Linux
use larger default sizes for various network caches (without us having
to fiddle with the settings).

    Thomas

=== old test results ====

[..]

What we did is run a

system A)

CPU Intel® XEON(TM) E3110 3000MHz 6MB FSB1333 S775
2x RAM DDR2 2GB PC667 Kingston ECC
NET INTEL Pro1000PT 1GBit 2xRJ45 NIC Dual Server
MBI SuperMicro X7SBi 
        Intel® 3210 + ICH9R Chipset. 
        Intel® 82573V + Intel® 82573L
           PCI-E Gigabit Controllers

against

system B)

CPU AMD Opteron 2220 2,8 GHz DualCore Socket F
4XRAM DDr2 1GB PC667 Kingston ECC-Reg CL5 with Parity
   Dual Rank + 2 DDR2 1GB / ECC / CL5 / 667MHz / with Parity / Dual Rank
NET INTEL Pro1000PT 1GBit 2xRJ45 NIC Dual Server
MBA Tyan Thunder h2000m (S3992G3NR-RS) DUAL SKT F
  EATX

I was running pktgen both generating
a single 64-byte/packet UDP stream and with 8192 parallel
flows of flowlen 4 with randomization of dst/src IPs
and ports (also UDP, 64 bytes/packet) so that the number of conntrack
entries stabilized at almost 512k (most of them timing out
of course).

The result was that the Opteron system is essentially
as fast as the Xeon system if you have just a
single flow, but for the second, more realistic
test case, the Xeon system was faster by about 10-20%,
probably due to the much larger CPU cache.

RX/TX flow control was enabled, iptables and connection
tracking were loaded. The incoming
and outgoing interfaces had their smp_affinity set to
a single CPU core each. The kernel was 2.6.23.14, with the e1000 driver
version that was current in Feb 2008.

As a ruleset, I had 2 chain trees for 8192 IPs each,
for ingress and egress; each IP had 10 non-matching
rules associated with it, but this ruleset
was only searched for --state NEW of course, resulting
in about 13*2=26 chain jumps and (13+10)*2=46 matches per
NEW packet.
(I had ~32k chains and ~210k rules)
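
The jump and match counts above can be reproduced with a small sketch. It assumes the chain tree is a balanced binary tree over the IPs with one jump per level (an inference from 13 = log2(8192), not something stated explicitly), followed by the 10 non-matching rules in the leaf chain. The function name is made up for this example.

```python
from typing import Tuple


def per_new_packet_cost(num_ips: int, leaf_rules: int,
                        directions: int = 2) -> Tuple[int, int]:
    """Return (chain jumps, rule matches) evaluated for one NEW packet.

    Assumes a balanced binary tree of chains over `num_ips` addresses,
    one jump per tree level, then `leaf_rules` rules in the leaf chain,
    repeated for each direction (ingress and egress).
    """
    if num_ips < 1:
        raise ValueError("need at least one IP")
    depth = (num_ips - 1).bit_length()  # ceil(log2(num_ips)) without floats
    jumps = depth * directions
    matches = (depth + leaf_rules) * directions
    return jumps, matches


if __name__ == "__main__":
    jumps, matches = per_new_packet_cost(num_ips=8192, leaf_rules=10)
    print(f"{jumps} chain jumps, {matches} matches per NEW packet")
```

The point of the tree is that the cost grows with the logarithm of the number of IPs, whereas a flat chain with one rule per IP is walked linearly.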

Unfortunately I only have the results for the Xeon-System,
the Opteron-Data got lost somehow ;-(

1 stream / default buffers

eth0:eth1  735kpps

500k streams / default buffers

eth0:eth1 254kpps

But those numbers are obviously not comparable to yours... so...

[..]

============= snip  ======


* Re: Using iptables with high volume mail
  2009-10-02 13:50         ` John Little
  2009-10-02 14:52           ` Thomas Jacob
@ 2009-10-02 15:08           ` Michele Petrazzo - Unipex
  2009-10-02 19:04             ` John Little
  1 sibling, 1 reply; 11+ messages in thread
From: Michele Petrazzo - Unipex @ 2009-10-02 15:08 UTC (permalink / raw)
  To: John Little; +Cc: Thomas Jacob, Gáspár Lajos, netfilter

John Little wrote:
>> What matters most is what happens in each time slice, not so much
>> how many connections you have in the connection hash table (you can
>> tune that table with with
>> /proc/sys/net/ipv4/netfilter/ip_conntrack_max and
>> /sys/module/ip_conntrack/parameters/hashsize).
>> 

Except that if the table fills up you'll see some "table full" kernel
messages and new connections will be dropped!
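
A simple headroom check along these lines might look like the sketch below. The function name, the 80% warning threshold and the 65,536 table limit in the example are all assumptions for illustration; the current entry count and the configured maximum are passed in by the caller.

```python
from typing import Tuple


def conntrack_headroom(current_entries: int, max_entries: int,
                       warn_ratio: float = 0.8) -> Tuple[float, bool]:
    """Return (utilization, should_warn) for the connection tracking table."""
    if max_entries <= 0:
        raise ValueError("max_entries must be positive")
    utilization = current_entries / max_entries
    return utilization, utilization >= warn_ratio


if __name__ == "__main__":
    # 22,981 is the measured peak quoted below; 65,536 is a hypothetical limit.
    utilization, warn = conntrack_headroom(22_981, 65_536)
    print(f"table {utilization:.0%} full, warn={warn}")
```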

>> However, emails per time should be pretty much the same as
>> connections per time, unless you open several tcp connections over
>> the nat box for each email, and I see no reason why you would need
>> to do that ;)
> 
> We have some stats now:
> Packets per second:  avg 6221 max 41,810
> Connections peak: avg 7263  max 22,981
> 
> New connections per second: avg 102 max 1029

> If I allocate 500 bytes per connection at the max
> connections I would need ~87Mb + machine overhead.  That's not much
> in today's world of servers.

Just to add some real numbers that will hopefully put your mind at ease: with the
lnstat tool (lnstat -f ip_conntrack) I have here:

100k entries in ip_conntrack, about 700 new conn/sec, 8k iptables
rules, 0.5/0.8 CPU load, with a 4-core Xeon, 2 GB RAM and an e1000e
card, and no problems at all.
The only thing I would tell you to do with these cards is to
_update_ the drivers to the latest ones found on the site, because the ones in
the vanilla kernel aren't so stable and can (who can say for sure?) *oops*
your server.

> Thanks, John
> 

Michele


* Re: Using iptables with high volume mail
  2009-10-02 15:08           ` Michele Petrazzo - Unipex
@ 2009-10-02 19:04             ` John Little
  0 siblings, 0 replies; 11+ messages in thread
From: John Little @ 2009-10-02 19:04 UTC (permalink / raw)
  To: Michele Petrazzo - Unipex; +Cc: Thomas Jacob, Gáspár Lajos, netfilter



Thomas, Michele and Gaspar,
That is some good information.  From what I can see what you guys are doing is in the realm of where we are with our connections per second, machine sizes/resources etc.  Thank you all for your responses.  We are going to compile the information that we have learned with your help and present it to management.  It certainly looks like the way we want to go.

Thank you very much for your help!
John


      
