From: Andrew Savchenko <bircoph@gmail.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: netdev@vger.kernel.org
Subject: Re: [BUG] Kernel recieves DNS reply, but doesn't deliver it to a waiting application
Date: Sun, 23 Dec 2012 15:06:27 +0400
Message-ID: <20121223150627.d7ebcf6a.bircoph@gmail.com>
In-Reply-To: <20121212122716.1e71f644.bircoph@gmail.com>


[-- Attachment #1.1: Type: text/plain, Size: 7633 bytes --]

Hello,

the bug has struck again on 3.7.0, see details below.

On Wed, 12 Dec 2012 12:27:16 +0400 Andrew Savchenko wrote:
> [...]
> > > Some driver or protocol stack is messing with skb->truesize, as
> > > your /proc/net/udp file contains anomalies :
> > > 
> > > $ cat /proc/net/udp
> > >   sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode ref pointer drops
> > > ...
> > >   323: 074A070A:007B 00000000:0000 07 FFFDF700:00000000 00:00000000 00000000   123        0 254469 2 ffff88003d581880 0
> > > ...
> > >   323: 00FCA8C0:007B 00000000:0000 07 FFFFF900:00000000 00:00000000 00000000     0        0 5187 2 ffff880039993880 0
> > > 
> > > It's clearly not possible to get tx_queue = 0xFFFDF700 or 0xFFFFF900
> > > 
> > > So what drivers handle following IP addresses : 192.168.252.0 , 10.7.74.7  ?
> > 
> > 192.168.252.0 is handled by eth0 interface running on Realtek
> > Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (10ec:8139) NIC.
> > Kernel driver 8139too. This interface handles multiple subnetworks:
> > 
> > # ip addr show eth0
> > 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000 
> > link/ether 00:80:48:30:ca:f3 brd ff:ff:ff:ff:ff:ff
> > inet 10.51.15.126/25 brd 10.51.15.127 scope global eth0
> > inet 192.168.252.0/31 scope global eth0
> > 
> > 10.7.74.7 is an l2tp connection handled by ppp over l2tp:
> > CONFIG_PPPOL2TP=y
> > It is running on top of eth0 described above.
> > 
> > # ip addr show ppp0
> > 65: ppp0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1400 qdisc pfifo_fast state UNKNOWN qlen 3
> > link/ppp 
> > inet 10.7.74.7 peer 10.7.2.18/32 scope global ppp0
> 
> I updated kernel on this system to 3.7.0 and udp anomaly is still
> present:
> 
> $ cat /proc/net/udp
>   sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode ref pointer drops             
>     0: 00000000:06A5 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 5326 2 ffff88003dbf0a80 0          
>     8: 00000000:7EAD 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 5157 2 ffff8800398c2000 0          
>    89: 00000000:90FE 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 5101 2 ffff88003dbd3500 0          
>   160: 0100007F:2745 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 4598 2 ffff88003d612700 0          
>   184: 0100007F:035D 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 4774 2 ffff88003d612a80 0          
>   217: 00000000:857E 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 5195 2 ffff8800398c2700 0          
>   318: 00000000:A9E3 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 4782 2 ffff88003d612e00 0          
>   335: 7E0F330A:01F4 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 5303 2 ffff8800398c2e00 0          
>   348: 00000000:0801 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 5186 2 ffff8800398c2380 0          
>   387: 7E0F330A:DE28 1400320A:06A5 01 00000000:00000000 00:00000000 00000000     0        0 5332 4 ffff88003dbf0e00 0          
>   400: 010013AC:0035 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 4842 2 ffff88003d613880 0          
>   400: 0100007F:0035 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 4841 2 ffff88003d613500 0          
>   414: 00000000:0043 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 5273 2 ffff8800398c2a80 0          
>   458: 00000000:006F 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 4483 2 ffff88003d612000 0          
>   459: 00000000:0270 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 4507 2 ffff88003d612380 0          
>   466: 00000000:0277 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 4802 2 ffff88003d613180 0          
>   470: 076A070A:007B 00000000:0000 07 FFFF4600:00000000 00:00000000 00000000   123        0 5552 2 ffff880039974380 0          
>   470: 010213AC:007B 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 4986 2 ffff88003dbd3180 0          
>   470: 010013AC:007B 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 4985 2 ffff88003dbd2e00 0          
>   470: 00FCA8C0:007B 00000000:0000 07 FFFFFB00:00000000 00:00000000 00000000     0        0 4984 2 ffff88003dbd2a80 0          
>   470: 7E0F330A:007B 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 4983 2 ffff88003dbd2700 0          
>   470: 0100007F:007B 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 4982 2 ffff88003dbd2380 0          
>   470: 00000000:007B 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 4975 2 ffff88003d613c00 0          
>   484: FF0013AC:0089 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 5316 2 ffff88003dbf0000 0          
>   484: 010013AC:0089 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 5315 2 ffff88003dbd3880 0          
>   484: FF0213AC:0089 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 5312 2 ffff8800398c3c00 0          
>   484: 010213AC:0089 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 5311 2 ffff8800398c3880 0          
>   484: 00000000:0089 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 5308 2 ffff8800398c3180 0          
>   485: FF0013AC:008A 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 5318 2 ffff88003dbf0700 0          
>   485: 010013AC:008A 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 5317 2 ffff88003dbf0380 0          
>   485: FF0213AC:008A 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 5314 2 ffff88003dbd3c00 0          
>   485: 010213AC:008A 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 5313 2 ffff88003dbd2000 0          
>   485: 00000000:008A 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 5309 2 ffff8800398c3500 0
> 
> The bug hasn't shown up yet, I'll need to wait for about a week to see
> if it is reproducible.

I hit this bug again after 11 days of uptime on a vanilla 3.7.0 kernel.
See the kernel config, /proc/net/udp, netstat -s and dropwatch logs
attached to this mail. The bug affects UDP DNS requests only;
TCP requests work fine, see the attached dig.log.
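
For reference, the tx_queue anomaly discussed earlier in the thread can be
spotted mechanically. Below is a minimal Python sketch that decodes the
local_address column and flags sockets whose tx_queue has the top bit set.
The function names are hypothetical, the little-endian address decoding
assumes an x86 host, and reading the value as a signed 32-bit number is my
interpretation, not something the kernel output states. The sample rows are
copied from the /proc/net/udp listing quoted above.

```python
import ipaddress
import struct

def decode_addr(hex_addr):
    """Decode 'AABBCCDD:PPPP' from /proc/net/udp into (ip, port).

    Assumption: little-endian host (x86), so the address bytes appear
    reversed relative to dotted-quad order.
    """
    ip_hex, port_hex = hex_addr.split(":")
    ip = ipaddress.IPv4Address(struct.pack("<I", int(ip_hex, 16)))
    return str(ip), int(port_hex, 16)

def suspicious_sockets(text):
    """Return (ip, port, tx_queue_as_signed_32bit) for implausible rows."""
    found = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and any elided ("...") or empty lines.
        if len(fields) < 5 or not fields[0].endswith(":"):
            continue
        tx_queue = int(fields[4].split(":")[0], 16)
        if tx_queue > 0x7FFFFFFF:
            # Assumption: a value this large is a counter that went negative.
            ip, port = decode_addr(fields[1])
            found.append((ip, port, tx_queue - (1 << 32)))
    return found

SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode ref pointer drops
  470: 076A070A:007B 00000000:0000 07 FFFF4600:00000000 00:00000000 00000000   123        0 5552 2 ffff880039974380 0
  470: 010213AC:007B 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 4986 2 ffff88003dbd3180 0
  470: 00FCA8C0:007B 00000000:0000 07 FFFFFB00:00000000 00:00000000 00000000     0        0 4984 2 ffff88003dbd2a80 0
"""

for ip, port, value in suspicious_sockets(SAMPLE):
    print(f"{ip}:{port} tx_queue reads as {value}")
```

On a live system the same function could be fed the contents of
/proc/net/udp instead of SAMPLE.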

Increasing net.ipv4.udp_mem from
24150        32201   48300
to
100000       150000  200000
helps, but I'm afraid only temporarily again.
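
To put these numbers in perspective: udp_mem is expressed in pages (min,
pressure, max), not bytes. The sketch below converts both settings to bytes
and shows how the "UDP: ... mem N" line of /proc/net/sockstat could be
compared against the max value. It assumes 4 KiB pages; the helper names and
the sockstat sample (including the 48299 figure) are invented for
illustration and are not measurements from this system.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as is usual on x86_64

def udp_mem_bytes(pages):
    """Convert a (min, pressure, max) udp_mem triple from pages to bytes."""
    return tuple(p * PAGE_SIZE for p in pages)

def udp_pages_in_use(sockstat_text):
    """Extract the UDP 'mem' counter (in pages) from /proc/net/sockstat text."""
    for line in sockstat_text.splitlines():
        if line.startswith("UDP:"):
            fields = line.split()
            # Fields after the label come in "name value" pairs.
            stats = dict(zip(fields[1::2], fields[2::2]))
            return int(stats["mem"])
    raise ValueError("no UDP line found")

OLD = (24150, 32201, 48300)
NEW = (100000, 150000, 200000)

for label, triple in (("old", OLD), ("new", NEW)):
    mib = [b / 2**20 for b in udp_mem_bytes(triple)]
    print(f"{label}: min={mib[0]:.1f} MiB pressure={mib[1]:.1f} MiB max={mib[2]:.1f} MiB")

# Hypothetical sockstat content, NOT taken from the affected machine.
SOCKSTAT_SAMPLE = """\
sockets: used 100
TCP: inuse 5 orphan 0 tw 0 alloc 6 mem 1
UDP: inuse 30 mem 48299
UDPLITE: inuse 0
"""

used = udp_pages_in_use(SOCKSTAT_SAMPLE)
print(f"UDP pages in use: {used} of {OLD[2]} allowed (old max)")
```

If the mem counter keeps creeping towards the max while the sockets are
idle, that would be consistent with the workaround being only temporary.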

Dropwatch data was collected in the following way:
- dropwatch.bug.* files contain data obtained after bug occurred;
- dropwatch.*.background files contain background data when no
  host or dig test was running; this system has active firewall
  and complicated routing, ipv6 disabled via sysctl, etc, so some
  drops are normal;
- dropwatch.*.host.request shows dropped packets recorded during
  host ya.ru request; of course, during this time some background
  packets were recorded as well (dropwatch doesn't support filtering
  at this moment);
- dropwatch.nobug.* data was collected after the bug was
  worked around via net.ipv4.udp_mem as described above.

As can be seen from the dropwatch logs, the drop at
__udp_queue_rcv_skb+61 happens only during a host request while the
bug is active, so something is wrong there.
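
The comparison between the logs can be reproduced with a short script. The
Python sketch below tallies drops per location and lists the locations that
appear in one log but not in a baseline. It assumes dropwatch prints lines of
the form "N drops at symbol+offset (0xaddress)"; the function names are
hypothetical, and the two sample logs are invented for illustration (only the
__udp_queue_rcv_skb+61 location comes from the real logs).

```python
import re
from collections import Counter

# Assumed dropwatch line format: "<count> drops at <symbol>+<hex offset> (0x<addr>)"
DROP_RE = re.compile(r"(\d+) drops at (\S+)\+(?:0x)?([0-9a-fA-F]+)")

def drop_counts(log_text):
    """Sum the reported drops per 'symbol+offset' location."""
    counts = Counter()
    for match in DROP_RE.finditer(log_text):
        count, symbol, offset = match.groups()
        counts[f"{symbol}+{offset}"] += int(count)
    return counts

def only_in(candidate, baseline):
    """Locations that drop packets in candidate but never in baseline."""
    return {loc: n for loc, n in candidate.items() if loc not in baseline}

# Invented sample logs; addresses and the ip_rcv_finish line are made up.
BACKGROUND = """\
3 drops at ip_rcv_finish+1f3 (0xffffffff81000001)
1 drops at ip_rcv_finish+1f3 (0xffffffff81000001)
"""
BUG_HOST_REQUEST = """\
2 drops at ip_rcv_finish+1f3 (0xffffffff81000001)
1 drops at __udp_queue_rcv_skb+61 (0xffffffff81000002)
1 drops at __udp_queue_rcv_skb+61 (0xffffffff81000002)
"""

extra = only_in(drop_counts(BUG_HOST_REQUEST), drop_counts(BACKGROUND))
for location, count in extra.items():
    print(f"{location}: {count} drops not seen in the background log")
```

Running it against the attached dropwatch.bug.* and dropwatch.*.background
files would require reading those files in place of the sample strings.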

Best regards,
Andrew Savchenko

[-- Attachment #1.2: kernel-udp-bug.tar.xz --]
[-- Type: application/octet-stream, Size: 19320 bytes --]

[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]
