From: Timo Teras <timo.teras@iki.fi>
To: Francois Romieu <romieu@fr.zoreil.com>
Cc: netdev@vger.kernel.org
Subject: Re: r8169 rx_missed increasing in bursts (regression)
Date: Wed, 9 Jan 2013 19:14:56 +0200
Message-ID: <20130109191456.0888ac75@vostro>
In-Reply-To: <20130109115850.055b7a7e@vostro>
On Wed, 9 Jan 2013 11:58:50 +0200 Timo Teras <timo.teras@iki.fi> wrote:
> On Tue, 8 Jan 2013 23:58:33 +0100 Francois Romieu
> <romieu@fr.zoreil.com> wrote:
>
> > Timo Teras <timo.teras@iki.fi> :
> > [...]
> > > My current hypothesis is that due to high softirq and recent(ish)
> > > commit da78dbf "r8169: remove work from irq handler" moving more
> > > work to softirq makes the receive path now suffer from latency
> > > from getting irq to reading packets from the NIC on these boxes.
> > > And that at times the rx fifo can get full causing a missed
> > > packet or so.
> >
> > This hypothesis won't explain the regression in 3.3.8 since 3.3.x
> > does not include commit da78dbf.
> >
> > Do you notice any netdev watchdog message in dmesg ?
>
> In production boxes. No.
>
> The lab environment where we tried to reproduce this, we received:
> NOHZ: local_softirq_pending 08
>
> Which is likely related, but separate issue. And fixed by commit
> da78dbf. So seems that just got upgraded to "regression fix".
>
> > 'perf top' may exhibit something unusual too.
>
> Will try this.
>
> I did notice that:
> /proc/net/softnet_stat's 3rd field aka. softnet_data.time_squeeze
> keeps incrementing whenever rx_missed increases. Sometimes
> time_squeeze increments on its own. But rx_missed never increases
> without time_squeeze bumping up seriously too.
I did some more general observation.
The rx_missed count does not seem to be directly related to the amount
of traffic. At times the box easily handles 10000+ pps, while packet
loss can happen at other times at 4000-8000 pps.
Generally time_squeeze does not increase, and the box is at 20-30%
softirq. Sometimes time_squeeze goes up by one or two (within a one
second interval) and no packet loss happens.
When rx_missed does get bumped, time_squeeze goes up by 1-3 and
rx_missed goes up by 50-1000 packets, usually around 200 (one second
sampling period).
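
For anyone wanting to reproduce this kind of sampling, here is a rough
sketch of one way to do it. It is not the exact tooling used for the
numbers above. It assumes that the softnet_stat columns are hexadecimal
with one row per CPU, that rx_missed is read from `ethtool -S` output,
and that the interface is called eth0. The helper names
(parse_time_squeeze, parse_ethtool_counter, deltas, sample_loop) are made
up for illustration.

```python
import subprocess
import time


def parse_time_squeeze(softnet_stat_text):
    """Sum the 3rd column (time_squeeze) over all per-CPU rows.

    Assumes the /proc/net/softnet_stat layout: one row per CPU,
    whitespace-separated hexadecimal fields.
    """
    total = 0
    for line in softnet_stat_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            total += int(fields[2], 16)
    return total


def parse_ethtool_counter(ethtool_text, name):
    """Extract one named counter from `ethtool -S <iface>` style output."""
    for line in ethtool_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == name:
            return int(value.strip())
    raise KeyError(name)


def deltas(prev, cur):
    """Per-counter increase between two samples."""
    return {key: cur[key] - prev[key] for key in cur}


def read_counters(iface):
    """Take one sample of both counters (needs Linux and ethtool)."""
    with open("/proc/net/softnet_stat") as f:
        squeeze = parse_time_squeeze(f.read())
    out = subprocess.run(["ethtool", "-S", iface], capture_output=True,
                         text=True, check=True).stdout
    return {"time_squeeze": squeeze,
            "rx_missed": parse_ethtool_counter(out, "rx_missed")}


def sample_loop(iface="eth0", interval=1.0):
    """Print counter increases once per interval until interrupted."""
    prev = read_counters(iface)
    while True:
        time.sleep(interval)
        cur = read_counters(iface)
        print(deltas(prev, cur))
        prev = cur
```

Calling sample_loop("eth0") prints one line per second with the increase
of both counters, which makes it easy to spot intervals where rx_missed
jumps together with time_squeeze.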
I did find a strong correlation: rx_missed usually increases when the
box has dropped a packet due to an iptables DROP/REJECT rule, or for
some other reason (e.g. once in a while I see dmesg contain
"nf_ct_sip: dropping packet").
Any ideas why a netfilter packet drop might cause netdevice rx to stall
long enough to saturate the hardware receive queue?
- Timo