netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Eric Dumazet <dada1@cosmosbay.com>
To: linux-kernel@vger.kernel.org,
	netfilter-devel@lists.netfilter.org, netdev@vger.kernel.org
Cc: Andi Kleen <ak@suse.de>
Subject: [PATCH 0/3] netfilter : 3 patches to boost ip_tables performance
Date: Wed, 21 Sep 2005 23:24:55 +0200	[thread overview]
Message-ID: <4331CFA7.50104@cosmosbay.com> (raw)
In-Reply-To: <43308324.70403@cosmosbay.com>

Hi

I have reworked net/ipv4/ip_tables.c to boost its performance, and post three 
patches.

In oprofile results, ipt_do_table() was at the first position.
It is now at 6th position, using 1/3 of the CPU it was using before.
(Tests done on a dual Xeon i386 and a dual Opteron x86_64)

Short description :

1) No more central rwlock protecting each table (filter, nat, mangle, raw),
    but one lock per CPU. It avoids cache line ping pongs for each packet.

2) Loop unrollings and various code optimizations to reduce branches
    mispredictions.

3) NUMA aware allocation of memory (this part was posted earlier, but got some 
polishing problems)


Long description:

1) No more one rwlock_t protecting the 'curtain'

One major bottleneck on SMP machines is the fact that one rwlock
is taken in ipt_do_table() entry and exit. That 2 atomic operations are
the killer, and even if multiple readers can work together on the same table
(using read_lock_bh()), the cache line containing rwlock still must be
taken exclusively by each CPU at entry/exit of ipt_do_table().

As existing code already use separate copies of the data for each cpu, it is
very easy to convert the central rwlock to separate rwlocks, allocated
percpu (and NUMA aware).

When a cpu enters ipt_do_table(), it can locks its local copy, touching a
cache line that is not used by other cpus. Further operations are done on
'local' copy of the data.

When a sum of all counters must be done, we can write_lock each part at a
time, instead of locking all the parts, reducing the lock contention.

2) Loop unrolling

It seems that with current compilers and CFLAGS, the code from
ip_packet_match() is very bad, using lot of mispredicted conditional branches 
I made some patches and generated code on i386 and x86_64
is much better.

3) NUMA allocation.

Part of the performance problem we have with netfilter is memory allocation
is not NUMA aware, but 'only' SMP aware (ie each CPU normally touch
separate cache lines)

Even with small iptables rules, the cost of this misplacement can be high
  on common workloads.

Instead of using one vmalloc() area (located in the node of the iptables
process), we now allocate an area for each possible CPU, using NUMA policy
(MPOL_PREFERRED) so that memory should be allocated in the CPU's node
if possible.

If the size of ipt_table is small enough (less than one page), we use
kmalloc_node() instead of vmalloc(), to use less memory and less TLB entries)
in small setups.

Note1 : I also optimized get_counters(), using a SET_COUNTER() for the first 
cpu, avoiding a memset() and ADD_COUNTER() if SMP on other cpus.

Note2 : This patch depends on another patch that declares sys_set_mempolicy()
in include/linux/syscalls.h
( http://marc.theaimsgroup.com/?l=linux-kernel&m=112725288622984&w=2 )


Thank you
Eric Dumazet

  reply	other threads:[~2005-09-21 21:24 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-09-19 17:09 [PATCH, netfilter] NUMA aware ipv4/netfilter/ip_tables.c Eric dumazet
2005-09-19 17:20 ` Eric Dumazet
2005-09-19 17:48 ` Andi Kleen
2005-09-19 19:09   ` Eric Dumazet
2005-09-20  9:47   ` Eric Dumazet
2005-09-20 16:30     ` Andi Kleen
2005-09-20 17:02       ` Eric Dumazet
2005-09-20 21:45       ` [PATCH] Adds sys_set_mempolicy() in include/linux/syscalls.h , " Eric Dumazet
2005-09-20 21:46         ` [PATCH] Adds sys_set_mempolicy() in include/linux/syscalls.h Eric Dumazet
2005-09-21 21:24           ` Eric Dumazet [this message]
2005-09-21 22:43             ` [PATCH 0/3] netfilter : 3 patches to boost ip_tables performance Christoph Lameter
2005-09-22  0:34               ` David S. Miller
2005-09-22  1:44                 ` Christoph Lameter
2005-09-22 12:11                   ` Eric Dumazet
2005-09-22 12:49                     ` Christoph Hellwig
2005-09-22 12:54                       ` Andi Kleen
2005-09-22 12:58                         ` Christoph Hellwig
2005-09-22 13:05                           ` Andi Kleen
2005-09-22 15:37                             ` Christoph Lameter
2005-09-22 15:50                               ` Eric Dumazet
2005-09-22 15:55                                 ` Christoph Lameter
2005-09-23 17:11                                 ` Harald Welte
2005-09-23 17:44                                   ` Christoph Lameter
2005-09-23 18:04                                     ` Dave Hansen
2005-09-23 17:47                                   ` Eric Dumazet
2005-09-23 18:00                                     ` Kyle Moffett
2005-09-22  4:18             ` James Morris
2005-09-22  5:07               ` Eric Dumazet
2005-09-22 13:03             ` Andi Kleen
2005-09-22 13:30               ` Eric Dumazet
2005-09-23 17:09               ` Harald Welte
2005-09-27 16:23                 ` Andi Kleen
2005-09-28  0:25                   ` Henrik Nordstrom
2005-09-28  8:32                     ` Harald Welte
2005-09-28  8:37                       ` Andi Kleen
2005-10-04 17:01                         ` Patrick McHardy
2005-10-05 16:53                           ` Andi Kleen
2005-10-07  2:38                             ` Harald Welte
2005-10-06 17:59                               ` Andi Kleen
2005-10-07 17:08                                 ` Patrick McHardy
2005-10-07 17:21                                   ` Andi Kleen
2005-10-07 17:50                                     ` Patrick McHardy
2005-09-28 10:34                       ` Henrik Nordstrom
2005-11-25 11:23             ` [PATCH] netfilter : zap get_cpu()/put_cpu() calls from ip_tables Eric Dumazet
2005-11-25 11:28               ` [PATCH (resent with the attachment !)] " Eric Dumazet
2005-11-25 18:20                 ` Patrick McHardy
2005-09-21 21:29           ` [PATCH 1/3] netfilter : 3 patches to boost ip_tables performance Eric Dumazet
2005-09-22 12:57             ` Harald Welte
2005-09-22 13:17               ` Eric Dumazet
2005-09-21 21:32           ` [PATCH 2/3] " Eric Dumazet
2005-09-22 12:48             ` Harald Welte
2005-09-22 13:05               ` Eric Dumazet
2005-09-23  4:02                 ` Willy Tarreau
2005-09-23  5:14                   ` Eric Dumazet
2005-09-23 11:33                     ` Willy Tarreau
2005-09-23 14:00                   ` Tim Mattox
2005-09-21 21:37           ` [PATCH 3/3] " Eric Dumazet
2005-09-22 12:50             ` Harald Welte

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4331CFA7.50104@cosmosbay.com \
    --to=dada1@cosmosbay.com \
    --cc=ak@suse.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=netfilter-devel@lists.netfilter.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).