From: Steven Rostedt <srostedt@redhat.com>
To: Jesper Dangaard Brouer <hawk@comx.dk>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>,
Eric Dumazet <eric.dumazet@gmail.com>,
Alexander Duyck <alexander.h.duyck@intel.com>,
Stephen Hemminger <shemminger@vyatta.com>,
netfilter-devel <netfilter-devel@vger.kernel.org>,
netdev <netdev@vger.kernel.org>,
Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Subject: Re: Possible regression: Packet drops during iptables calls
Date: Thu, 16 Dec 2010 09:20:44 -0500
Message-ID: <1292509244.2733.224.camel@fedora>
In-Reply-To: <1292508266.31289.12.camel@firesoul.comx.local>
On Thu, 2010-12-16 at 15:04 +0100, Jesper Dangaard Brouer wrote:
>
> To do some further investigation into the unfortunate behavior of the
> iptables get_counters() function I started to use "ftrace". This is a
> really useful tool (thanks Steven Rostedt).
>
> # Select the tracer "function_graph"
> echo function_graph > /sys/kernel/debug/tracing/current_tracer
>
> # Limit the number of functions we look at:
> echo local_bh_\* > /sys/kernel/debug/tracing/set_ftrace_filter
> echo get_counters >> /sys/kernel/debug/tracing/set_ftrace_filter
>
> # Enable tracing while calling iptables
> cd /sys/kernel/debug/tracing
> echo 0 > trace
> echo 1 > tracing_enabled;
> taskset 1 iptables -vnL > /dev/null ;
> echo 0 > tracing_enabled
> cat trace | less
Just an FYI, you can do the above much more easily with trace-cmd:
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/trace-cmd.git
# trace-cmd record -p function_graph -l 'local_bh_*' -l get_counters taskset 1 iptables -vnL > /dev/null
# trace-cmd report
-- Steve
>
>
> The reduced output:
>
> # tracer: function_graph
> #
> # CPU DURATION FUNCTION CALLS
> # | | | | | | |
> 2) 2.772 us | local_bh_disable();
> ....
> 0) 0.228 us | local_bh_enable();
> 0) | get_counters() {
> 0) 0.232 us | local_bh_disable();
> 0) 7.919 us | local_bh_enable();
> 0) ! 109467.1 us | }
> 0) 2.344 us | local_bh_disable();
>
>
> The output shows that we spend no less than 100 ms with local BH
> disabled. So, no wonder that this causes packet drops in the NIC
> (attached to this CPU).
>
> My iptables rule set in question is also very large, it contains:
> Chains: 20929
> Rules: 81239
>
> The vmalloc size is approx 19 MB (19,820,544 bytes) (see
> /proc/vmallocinfo). Looking through vmallocinfo I realized that
> even though I only have 16 CPUs, there are 32 allocated rulesets
> "xt_alloc_table_info" (for the filter table). Thus, I have approx
> 634 MB of iptables filter rules in the kernel, half of which is totally
> unused.
>
> I guess this is because we use "for_each_possible_cpu" instead of
> "for_each_online_cpu". (Feel free to fix this, or point me to some
> documentation of this CPU hotplug stuff... I see we are missing
> get_cpu() and put_cpu() in a lot of places).
>
>
> The GOOD NEWS is that moving the local BH disable section into the
> "for_each_possible_cpu" loop fixed the problem with packet drops during
> iptables calls.
>
> I wanted to profile with ftrace on the new code, but I cannot get the
> measurement I want. Perhaps Steven or Acme can help?
>
> Now I want to measure the time used between the local_bh_disable() and
> local_bh_enable() calls within the loop, but I cannot figure out how to
> do that. The new trace looks almost the same as before, just with a lot
> of local_bh_* calls inside the get_counters() function call.
>
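One possible way to get at the per-iteration time is to post-process
timestamped trace output and pair up the calls. The following is only a
minimal sketch: it assumes lines shaped like the "function" tracer's
output (CPU in brackets, a seconds timestamp, then the function name),
it assumes the disable/enable pairs on a CPU are not nested, and the
name bh_disabled_spans() and the sample lines are hypothetical. It
measures entry-to-entry time, so it only approximates the BH-off window.

```python
import re

# Assumed line shape (hypothetical sample, modeled on the "function" tracer):
#   iptables-2733  [000]  1234.000100: local_bh_disable <-get_counters
# An optional flags column between the CPU and the timestamp is tolerated.
LINE_RE = re.compile(
    r"\[(\d+)\]\s+(?:\S+\s+)?(\d+\.\d+):\s+(local_bh_disable|local_bh_enable)\b"
)


def bh_disabled_spans(lines):
    """Return (cpu, seconds) for each local_bh_disable -> local_bh_enable pair.

    Assumes pairs on the same CPU are not nested.
    """
    pending = {}  # cpu -> timestamp of the local_bh_disable not yet matched
    spans = []
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # not one of the two functions we care about
        cpu, ts, func = int(m.group(1)), float(m.group(2)), m.group(3)
        if func == "local_bh_disable":
            pending[cpu] = ts
        elif cpu in pending:
            spans.append((cpu, ts - pending.pop(cpu)))
    return spans


if __name__ == "__main__":
    # Hypothetical sample input, not taken from a real trace.
    sample = [
        "  iptables-2733  [000]  1234.000100: local_bh_disable <-get_counters",
        "  iptables-2733  [000]  1234.003225: local_bh_enable <-get_counters",
    ]
    for cpu, seconds in bh_disabled_spans(sample):
        print(f"cpu {cpu}: BH disabled for about {seconds * 1e6:.0f} us")
```

The longest span in the result is the interesting number, since that is
the worst-case time the NIC has to buffer packets for.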
> My guess is that the time spent is: 100 ms / 32 = 3.125 ms.
>
> Now I just need to calculate how large a NIC buffer I need to absorb
> 3.125 ms of traffic at 1 Gbit/s.
>
> 3.125 ms * 1Gbit/s = 390625 bytes
>
> Can this be correct?
>
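The arithmetic quoted above can be checked with a few lines. This is a
sketch under the same assumptions as the quoted estimate: the ~100 ms is
split evenly over 32 per-CPU copies, and the link runs at its full
nominal 1 Gbit/s for the whole stall. The function name buffer_bytes()
is made up for illustration. Integer units (microseconds, bits) are used
to avoid floating-point rounding.

```python
def buffer_bytes(stall_us: int, link_bps: int) -> int:
    """Bytes arriving at line rate while the receiver is stalled."""
    bits = stall_us * link_bps // 1_000_000  # microseconds -> seconds
    return bits // 8


total_stall_us = 100_000   # ~100 ms in get_counters(), rounded as in the estimate
copies = 32                # one ruleset copy per possible CPU
per_copy_us = total_stall_us // copies

print(per_copy_us)                               # 3125 (us), i.e. 3.125 ms
print(buffer_bytes(per_copy_us, 1_000_000_000))  # 390625 (bytes)
```

So the 390625 bytes figure follows from the stated inputs. It is a
line-rate upper bound for one port and says nothing about how the NIC
divides its buffer between queues.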
> How much buffer does each queue have in the 82576 NIC?
> (Hope Alexander Duyck can answer this one?)
>