netdev.vger.kernel.org archive mirror
* Bridge + Conntrack + SKB Recycle: Fragment Reassembly Errors
@ 2009-11-10 16:09 ben
From: ben @ 2009-11-10 16:09 UTC
  To: netdev

We have observed significant reassembly errors when combining
routing/bridging with conntrack + nf_defrag_ipv4 loaded, and
skb_recycle_check-enabled interfaces.  For our test, we had a single
Linux device with two interfaces (gianfars in this case) with SKB
recycling enabled.  We sent large, continuous pings across the bridge,
like this:
ping -s 64000 -A <dest IP>

Then, we ran netstat -s --raw, and noticed that IPSTATS_MIB_REASMFAILS
was being incremented for about 40% of the received datagrams.  Tracing the
code in ip_fragment.c, we instrumented each of the
IPSTATS_MIB_REASMFAILS locations, and found the culprit to be
ip_evictor.  Nothing looked unusual here, so we placed tracing in
ip_frag_queue, directly above:
	atomic_add(skb->truesize, &qp->q.net->mem);

We noticed that quite a few of the skb->truesize values were in the 67K
range, which quickly overwhelms the default 192K-ish ipfrag_low_thresh.
This means that the next time inet_frag_evictor is run, the result of:
 work = atomic_read(&nf->mem) - nf->low_thresh;

will surely be positive, and it is likely that our huge-frag-containing
queue will be one of those evicted.
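
To put rough numbers on the accounting described above, here is a small
user-space sketch (not kernel code).  The threshold value assumes the
192 KiB default mentioned above; frags_until_over() and evictor_work()
are hypothetical helpers written only for this illustration, the latter
mirroring the "work" computation quoted from inet_frag_evictor.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed default for ipfrag_low_thresh, in bytes (192 KiB). */
#define IPFRAG_LOW_THRESH (192 * 1024)

/*
 * Hypothetical helper: how many queued fragments of a given truesize
 * it takes before the accounted memory exceeds `limit`.
 */
static size_t frags_until_over(size_t truesize, size_t limit)
{
    assert(truesize > 0);
    return limit / truesize + 1;
}

/*
 * Mirrors the evictor computation quoted above:
 *   work = atomic_read(&nf->mem) - nf->low_thresh;
 * A positive result means the evictor has queues to drop.
 */
static long evictor_work(long mem, long low_thresh)
{
    return mem - low_thresh;
}
```

Under these assumptions, frags_until_over(67 * 1024, IPFRAG_LOW_THRESH)
is 3, while the same call with a 2 KiB truesize gives 97.  So a handful
of recycled 67K-truesize skbs is enough to push the accounted memory
past the low threshold, even though the data they carry is only one
MTU-sized fragment each.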

Looking at the source of these huge skbs, it seems that during
re-fragmentation in br_nf_dev_queue_xmit (which calls ip_fragment with
CONFIG_NF_CONNTRACK_IPV4 enabled), the huge buffer that was allocated
to hold a successfully reassembled datagram may be getting reused?  In any
case, when skb_recycle_check(skb, min_rx_size) is called, the huge
(skb->truesize huge, not data huge) skb is recycled for use on RX, and
it eventually gets enqueued for reassembly, causing the
inet_frag_evictor to have a positive work value.

Our solution was to add an upper-bound check to skb_recycle_check,
which prevents the large-ish SKBs from being reused for future
frags and overwhelming ipfrag_low_thresh.  This seems quite clunky,
although I would be happy to submit this as a patch...
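
The patch itself is not included in this message, so the following is
only a user-space model of what such an upper-bound check might look
like.  struct model_skb, model_recycle_check() and RECYCLE_MAX_FACTOR
are all hypothetical names; the real skb_recycle_check() works on
struct sk_buff and performs additional checks (cloned, shared,
nonlinear) that are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the one sk_buff property that matters here. */
struct model_skb {
    size_t buf_size;  /* size of the linear buffer; truesize grows with it */
};

/* Hypothetical cap: reject buffers more than 2x the size RX asks for. */
#define RECYCLE_MAX_FACTOR 2

static bool model_recycle_check(const struct model_skb *skb, size_t min_rx_size)
{
    assert(skb != NULL);

    /* Lower bound: the buffer must be large enough for RX. */
    if (skb->buf_size < min_rx_size)
        return false;

    /* Assumed upper bound: do not recycle grossly oversized buffers. */
    if (skb->buf_size > RECYCLE_MAX_FACTOR * min_rx_size)
        return false;

    return true;
}
```

With min_rx_size = 1536, a 1600-byte buffer would still be recycled,
while a ~66000-byte buffer left over from reassembly would be freed
instead of being handed back to the RX ring.  The factor of 2 is an
arbitrary choice for the sketch.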

If this is not the right place...what is the "right" place?

Ben Menchaca
Bigfoot Networks

