All of lore.kernel.org
 help / color / mirror / Atom feed
From: Florian Westphal <fw@strlen.de>
To: Eric Dumazet <edumazet@google.com>
Cc: Eric Desrochers <ericd@miromedia.ca>,
	Florian Westphal <fw@strlen.de>,
	netfilter-devel@vger.kernel.org
Subject: Re: netfilter question
Date: Wed, 16 Nov 2016 16:02:00 +0100	[thread overview]
Message-ID: <20161116150200.GH30581@breakpoint.cc> (raw)
In-Reply-To: <CANn89iLr-Zf2W5MW47QtKo17915UoryZentihKt1SMv7NP_dOw@mail.gmail.com>

Eric Dumazet <edumazet@google.com> wrote:
> On Wed, Nov 16, 2016 at 2:22 AM, Eric Desrochers <ericd@miromedia.ca> wrote:
> > Hi Eric,
> >
> > My name is Eric and I'm reaching you today as I found your name in multiple netfilter kernel commits, and was hoping we can discuss about a potential regression.
> >
> > I identified (git bisect) this commit [https://github.com/torvalds/linux/commit/71ae0dff02d756e4d2ca710b79f2ff5390029a5f] as the one introducing a serious performance slowdown when using the binary ip/ip6tables with a large number of policies.
> >
> > I also tried with the latest and greatest v4.9-rc4 mainline kernel, and the slowdown is still present.
> >
> > So even commit [https://github.com/torvalds/linux/commit/a1a56aaa0735c7821e85c37268a7c12e132415cb] which introduce a 16 bytes alignment on xt_counter percpu allocations so that bytes and packets sit in same cache line, doesn't have impact too.
> >
> >
> > Everything I found is detailed in the following bug : https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1640786
> >
> > Of course, I'm totally aware that "iptables-restore" should be the favorite choice as it is way more efficient (note that using iptables-restore doesn't exhibit the problem) but some folks still rely on ip/ip6tables and might face this performance slowdown.
> >
> > I found the problem today, I will continue to investigate on my side, but I was wondering if we could have a discussion about this subject.
> >
> > Thanks in advance.

[..]

> Key point is that we really care about fast path : packet processing.
> And cited commit helps this a lot by lowering the memory foot print on
> hosts with many cores.
> This is a step into right direction.
> 
> Now we probably should batch the percpu allocations one page at a
> time, or ask Tejun if percpu allocations could be really really fast
> (probably much harder)
> 
> But really you should not use iptables one rule at a time...
> This will never compete with iptables-restore. ;)
> 
> Florian, would you have time to work on a patch trying to group the
> percpu allocations one page at a time ?

You mean something like this ? :
        xt_entry_foreach(iter, entry0, newinfo->size) {
-               ret = find_check_entry(iter, net, repl->name, repl->size);
-               if (ret != 0)
+               if (pcpu_alloc == 0) {
+                       pcnt = __alloc_percpu(PAGE_SIZE, sizeof(struct xt_counters));
+                       if (IS_ERR_VALUE(pcnt))
+                               BUG();
+               }
+
+               iter->counters.pcnt = pcnt + pcpu_alloc;
+               iter->counters.bcnt = !!pcpu_alloc;
+               pcpu_alloc += sizeof(struct xt_counters);
+
+               if (pcpu_alloc > PAGE_SIZE - sizeof(struct xt_counters))
+                       pcpu_alloc = 0;
+
+               ret = find_check_entry(iter, net, repl->name, repl->size)
 ...

This is going to be ugly since we'll have to deal with SMP vs. NONSMP (i.e. no perpcu allocations)
in ip/ip6/arptables.

Error unwind will also be a mess (we can abuse .bcnt to tell if pcpu offset should be free'd or not).

But maybe I don't understand what you are suggesting :)
Can you elaborate?

Thanks!

  reply	other threads:[~2016-11-16 15:04 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <cad49557-7c7a-83c9-d2b6-71d9624f0d52@miromedia.ca>
2016-11-16 13:33 ` netfilter question Eric Dumazet
2016-11-16 15:02   ` Florian Westphal [this message]
2016-11-16 15:23     ` Eric Dumazet
2016-11-17  0:07       ` Florian Westphal
2016-11-17  2:34         ` Eric Dumazet
2016-11-17 15:49         ` Eric Desrochers
2016-11-20  6:33         ` Eric Dumazet
     [not found]           ` <CAGUFhKwQTRRJpfGi2fRkFfGdpLYMN-2F9G+dEsavM7UGbkjjdA@mail.gmail.com>
2016-11-20 17:31             ` Eric Dumazet
2016-11-20 17:55               ` Eric Dumazet
2012-12-10 20:12 Sri Ram Vemulpali
2012-12-10 20:12 ` Sri Ram Vemulpali
  -- strict thread matches above, loose matches on Subject: below --
2005-02-08  7:50 netfilter & ipv6 Jonas Berlin
     [not found] ` <53965.213.236.112.75.1107867276.squirrel@213.236.112.75>
2005-02-10 23:15   ` ULOG target for ipv6 Jonas Berlin
2005-02-11 22:10     ` netfilter question Pedro Fortuna
2004-02-19 20:25 John Black
2004-02-19 21:22 ` Antony Stone
2004-02-19 16:56 John Black
2004-02-19 16:23 John Black
2004-02-19 17:06 ` Antony Stone
2004-02-19 16:00 John Black
2004-02-19 14:13 John Black
2004-02-19 14:51 ` Alexis
2004-02-19 13:38 John Black
2004-02-19 14:18 ` Antony Stone
2004-02-19  3:32 John Black
2004-02-19  8:19 ` Klemen Kecman
2004-02-19  9:22   ` Antony Stone
2004-02-19 13:06   ` John Black
2004-02-19 13:17     ` Antony Stone
2001-10-24 13:09 Netfilter Question Shiva Raman Pandey

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161116150200.GH30581@breakpoint.cc \
    --to=fw@strlen.de \
    --cc=edumazet@google.com \
    --cc=ericd@miromedia.ca \
    --cc=netfilter-devel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.