From: Eric Dumazet <eric.dumazet@gmail.com>
To: Sridhar Samudrala <sri@us.ibm.com>
Cc: David Miller <davem@davemloft.net>,
cl@linux-foundation.org, dlstevens@us.ibm.com,
netdev@vger.kernel.org, niv@linux.vnet.ibm.com,
mtk.manpages@gmail.com
Subject: Re: [PATCH net-next-2.6] ip: Report qdisc packet drops
Date: Wed, 02 Sep 2009 18:20:48 +0200 [thread overview]
Message-ID: <4A9E9B60.8010408@gmail.com> (raw)
In-Reply-To: <1251907915.6468.15.camel@w-sridhar.beaverton.ibm.com>
Sridhar Samudrala a écrit :
> On Wed, 2009-09-02 at 16:43 +0200, Eric Dumazet wrote:
>> David Miller a écrit :
>>> From: Eric Dumazet <eric.dumazet@gmail.com>
>>> Date: Mon, 31 Aug 2009 14:09:50 +0200
>>>
>>>> Re-reading again this stuff, I realized ip6_push_pending_frames()
>>>> was not updating IPSTATS_MIB_OUTDISCARDS, even if IP_RECVERR was set.
>>>>
>>>> May I suggest following path :
>>>>
>>>> 1) Correct ip6_push_pending_frames() to properly
>>>> account for dropped-by-qdisc frames when IP_RECVERR is set
>>> Your patch is applied to net-next-2.6, thanks!
>>>
>>>> 2) Submit a patch to account for qdisc-dropped frames in SNMP counters
>>>> but still return a OK to user application, to not break them ?
>>> Sounds good.
>>>
>>> I think if you sample random UDP applications, you will find that such
>>> errors will upset them terribly, make them log tons of crap to
>>> /var/log/messages et al., and consume tons of CPU.
>>>
>>> And in such cases silent ignoring of drops is entirely appropriate and
>>> optimal, which supports our current behavior.
>>>
>>> If we are to make such applications "more sophisticated" such
>>> converted apps can be indicated simply their use of IP_RECVERR.
>>>
>>> If you want to be notified of all asynchronous errors we can detect,
>>> you use this, end of story. It is the only way to handle this
>>> situation without breaking the world.
>>>
>>> As usual, Alexey Kuznetsov's analysis of this situation is timeless,
>>> accurate, and wise. And he understood all of this 10+ years ago.
>> Thanks David, here is the 2nd patch then :
>>
>>
>> [PATCH net-next-2.6] ip: Report qdisc packet drops
>>
>> Christoph Lameter pointed out that packet drops at qdisc level where not
>> accounted in SNMP counters. Only if application sets IP_RECVERR, drops
>> are reported to user (-ENOBUFS errors) and SNMP counters updated.
>>
>> IP_RECVERR is used to enable extended reliable error message passing,
>> but these are not needed to update system wide SNMP stats.
>>
>> This patch changes things a bit to allow SNMP counters to be updated,
>> regardless of IP_RECVERR being set or not on the socket.
>>
>> Example after an UDP tx flood
>> # netstat -s
>> ...
>> IP:
>> 1487048 outgoing packets dropped
>> ...
>> Udp:
>> ...
>> SndbufErrors: 1487048
>>
>
> Didn't we agree that qdisc drops should not be counted as IP or UDP
> drops as David Stevens pointed out?
> I would say that even when IP_RECVERR is set, SNMP counters at IP and
> UDP should not be counted when a packet is dropped at qdisc level,
> but the error can be reported to user.
>
> Now that qdisc stats issue is figured out and they can be accounted
> and seen at qdisc level, doesn't it confuse if we count the same drop
> at IP, UDP and qdisc level?
>
> Thanks
> Sridhar
>
Yes, I am aware of David point, but its already not true with current kernel.
Current kernels and an UDP frame sent by application :
if IP_RECVERR not set, no SNMP error logged, IP or UDP level
if IP_RECVERR is set, qdisc drops are reported both to IP and UDP
SNMP counters.
udp_sendmsg()
{
...
out:
ip_rt_put(rt);
if (free)
kfree(ipc.opt);
if (!err)
return len;
/*
* ENOBUFS = no kernel mem, SOCK_NOSPACE = no sndbuf space. Reporting
* ENOBUFS might not be good (it's not tunable per se), but otherwise
* we don't have a good statistic (IpOutDiscards but it can be too many
* things). We could add another new stat but at least for now that
* seems like overkill.
*/
if (err == -ENOBUFS || test_bit(SOCK_NOSPACE, &sk->sk_socket->flags)) {
UDP_INC_STATS_USER(sock_net(sk),
UDP_MIB_SNDBUFERRORS, is_udplite);
}
return err;
...
}
So what shall we do ?
IMHO, one should not add MIB counters for different domains (IP / UDP), this
makes no sense.
next prev parent reply other threads:[~2009-09-02 16:21 UTC|newest]
Thread overview: 73+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-08-17 20:01 UDP multicast packet loss not reported if TX ring overrun? Christoph Lameter
2009-08-17 20:40 ` Nivedita Singhvi
2009-08-17 20:46 ` Christoph Lameter
2009-08-17 21:50 ` Sridhar Samudrala
2009-08-17 22:13 ` Christoph Lameter
2009-08-17 22:43 ` Sridhar Samudrala
2009-08-17 22:52 ` Christoph Lameter
2009-08-17 22:57 ` Christoph Lameter
2009-08-18 0:12 ` Sridhar Samudrala
2009-08-18 0:25 ` Christoph Lameter
2009-08-24 17:40 ` Christoph Lameter
2009-08-24 22:02 ` Eric Dumazet
2009-08-24 22:36 ` Sridhar Samudrala
2009-08-25 13:48 ` Christoph Lameter
2009-08-25 19:03 ` David Stevens
2009-08-25 19:08 ` David Miller
2009-08-25 19:15 ` Christoph Lameter
2009-08-25 19:56 ` Joe Perches
2009-08-25 22:35 ` Sridhar Samudrala
2009-08-26 14:08 ` Christoph Lameter
2009-08-26 14:22 ` Eric Dumazet
2009-08-26 15:27 ` Christoph Lameter
2009-08-26 16:29 ` Christoph Lameter
2009-08-26 17:50 ` Sridhar Samudrala
2009-08-26 19:09 ` Christoph Lameter
2009-08-26 22:11 ` Sridhar Samudrala
2009-08-27 15:40 ` Christoph Lameter
2009-08-27 20:23 ` Christoph Lameter
2009-08-28 13:53 ` Christoph Lameter
2009-08-28 15:07 ` Eric Dumazet
2009-08-28 16:15 ` Christoph Lameter
2009-08-28 17:26 ` [PATCH net-next-2.6] ip: Report qdisc packet drops Eric Dumazet
2009-08-29 6:38 ` David Miller
2009-08-31 12:09 ` Eric Dumazet
2009-09-02 1:41 ` David Miller
2009-09-02 14:43 ` Eric Dumazet
2009-09-02 16:11 ` Sridhar Samudrala
2009-09-02 16:20 ` Eric Dumazet [this message]
2009-09-02 19:37 ` Christoph Lameter
2009-09-02 16:05 ` Eric Dumazet
2009-09-02 22:26 ` Eric Dumazet
2009-09-03 1:05 ` David Miller
2009-09-03 17:57 ` Christoph Lameter
2009-09-03 14:12 ` Eric Dumazet
2009-09-03 18:35 ` Christoph Lameter
2009-09-02 18:22 ` Christoph Lameter
2009-09-03 1:09 ` David Miller
2009-08-28 19:26 ` UDP multicast packet loss not reported if TX ring overrun? David Miller
2009-08-28 20:00 ` Christoph Lameter
2009-08-28 20:04 ` David Miller
2009-08-28 19:24 ` David Miller
2009-08-28 19:53 ` Christoph Lameter
2009-08-28 20:03 ` David Miller
2009-08-28 20:09 ` Christoph Lameter
2009-08-30 0:21 ` Mark Smith
2009-08-25 13:46 ` Christoph Lameter
2009-08-25 13:48 ` Eric Dumazet
2009-08-25 14:00 ` Christoph Lameter
2009-08-25 15:32 ` Eric Dumazet
2009-08-25 15:35 ` Christoph Lameter
2009-08-25 15:58 ` Eric Dumazet
2009-08-25 16:11 ` Christoph Lameter
2009-08-25 16:27 ` Eric Dumazet
2009-08-25 16:36 ` Christoph Lameter
2009-08-25 16:48 ` Eric Dumazet
2009-08-25 17:01 ` Christoph Lameter
2009-08-25 17:08 ` Eric Dumazet
2009-08-25 17:44 ` Christoph Lameter
2009-08-25 17:53 ` Christoph Lameter
2009-08-25 18:38 ` Sridhar Samudrala
2009-08-24 23:14 ` Eric Dumazet
2009-08-25 6:46 ` Eric Dumazet
2009-08-25 13:45 ` Christoph Lameter
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4A9E9B60.8010408@gmail.com \
--to=eric.dumazet@gmail.com \
--cc=cl@linux-foundation.org \
--cc=davem@davemloft.net \
--cc=dlstevens@us.ibm.com \
--cc=mtk.manpages@gmail.com \
--cc=netdev@vger.kernel.org \
--cc=niv@linux.vnet.ibm.com \
--cc=sri@us.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).