netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Florian Westphal <fw@strlen.de>
To: Eric Dumazet <edumazet@google.com>
Cc: Eric Desrochers <ericd@miromedia.ca>,
	Florian Westphal <fw@strlen.de>,
	netfilter-devel@vger.kernel.org
Subject: Re: netfilter question
Date: Wed, 16 Nov 2016 16:02:00 +0100	[thread overview]
Message-ID: <20161116150200.GH30581@breakpoint.cc> (raw)
In-Reply-To: <CANn89iLr-Zf2W5MW47QtKo17915UoryZentihKt1SMv7NP_dOw@mail.gmail.com>

Eric Dumazet <edumazet@google.com> wrote:
> On Wed, Nov 16, 2016 at 2:22 AM, Eric Desrochers <ericd@miromedia.ca> wrote:
> > Hi Eric,
> >
> > My name is Eric and I'm reaching you today as I found your name in multiple netfilter kernel commits, and was hoping we can discuss about a potential regression.
> >
> > I identified (git bisect) this commit [https://github.com/torvalds/linux/commit/71ae0dff02d756e4d2ca710b79f2ff5390029a5f] as the one introducing a serious performance slowdown when using the binary ip/ip6tables with a large number of policies.
> >
> > I also tried with the latest and greatest v4.9-rc4 mainline kernel, and the slowdown is still present.
> >
> > So even commit [https://github.com/torvalds/linux/commit/a1a56aaa0735c7821e85c37268a7c12e132415cb] which introduce a 16 bytes alignment on xt_counter percpu allocations so that bytes and packets sit in same cache line, doesn't have impact too.
> >
> >
> > Everything I found is detailed in the following bug : https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1640786
> >
> > Of course, I'm totally aware that "iptables-restore" should be the favorite choice as it is way more efficient (note that using iptables-restore doesn't exhibit the problem) but some folks still rely on ip/ip6tables and might face this performance slowdown.
> >
> > I found the problem today, I will continue to investigate on my side, but I was wondering if we could have a discussion about this subject.
> >
> > Thanks in advance.

[..]

> Key point is that we really care about fast path : packet processing.
> And cited commit helps this a lot by lowering the memory foot print on
> hosts with many cores.
> This is a step into right direction.
> 
> Now we probably should batch the percpu allocations one page at a
> time, or ask Tejun if percpu allocations could be really really fast
> (probably much harder)
> 
> But really you should not use iptables one rule at a time...
> This will never compete with iptables-restore. ;)
> 
> Florian, would you have time to work on a patch trying to group the
> percpu allocations one page at a time ?

You mean something like this ? :
        xt_entry_foreach(iter, entry0, newinfo->size) {
-               ret = find_check_entry(iter, net, repl->name, repl->size);
-               if (ret != 0)
+               if (pcpu_alloc == 0) {
+                       pcnt = __alloc_percpu(PAGE_SIZE, sizeof(struct xt_counters));
+                       if (IS_ERR_VALUE(pcnt))
+                               BUG();
+               }
+
+               iter->counters.pcnt = pcnt + pcpu_alloc;
+               iter->counters.bcnt = !!pcpu_alloc;
+               pcpu_alloc += sizeof(struct xt_counters);
+
+               if (pcpu_alloc > PAGE_SIZE - sizeof(struct xt_counters))
+                       pcpu_alloc = 0;
+
+               ret = find_check_entry(iter, net, repl->name, repl->size)
 ...

This is going to be ugly since we'll have to deal with SMP vs. NONSMP (i.e. no perpcu allocations)
in ip/ip6/arptables.

Error unwind will also be a mess (we can abuse .bcnt to tell if pcpu offset should be free'd or not).

But maybe I don't understand what you are suggesting :)
Can you elaborate?

Thanks!

  reply	other threads:[~2016-11-16 15:04 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <cad49557-7c7a-83c9-d2b6-71d9624f0d52@miromedia.ca>
2016-11-16 13:33 ` netfilter question Eric Dumazet
2016-11-16 15:02   ` Florian Westphal [this message]
2016-11-16 15:23     ` Eric Dumazet
2016-11-17  0:07       ` Florian Westphal
2016-11-17  2:34         ` Eric Dumazet
2016-11-17 15:49         ` Eric Desrochers
2016-11-20  6:33         ` Eric Dumazet
     [not found]           ` <CAGUFhKwQTRRJpfGi2fRkFfGdpLYMN-2F9G+dEsavM7UGbkjjdA@mail.gmail.com>
2016-11-20 17:31             ` Eric Dumazet
2016-11-20 17:55               ` Eric Dumazet
2005-02-08  7:50 netfilter & ipv6 Jonas Berlin
     [not found] ` <53965.213.236.112.75.1107867276.squirrel@213.236.112.75>
2005-02-10 23:15   ` ULOG target for ipv6 Jonas Berlin
2005-02-11 22:10     ` netfilter question Pedro Fortuna

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161116150200.GH30581@breakpoint.cc \
    --to=fw@strlen.de \
    --cc=edumazet@google.com \
    --cc=ericd@miromedia.ca \
    --cc=netfilter-devel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).