netdev.vger.kernel.org archive mirror
* Kernel receives DNS reply, but doesn't deliver it to a waiting application
From: Andrew Savchenko @ 2012-10-03 19:25 UTC
  To: netdev


Hello,

I have encountered a very strange bug: after some uptime the kernel stops
delivering DNS replies to applications. tcpdump shows that a correct reply is
received, but strace shows that the querying application never receives it and
ends with a timeout; epoll_wait() always returns 0. An excerpt from the strace
output of "host kernel.org 8.8.8.8":

sendmsg(20, {msg_name(16)={sa_family=AF_INET, sin_port=htons(53),
sin_addr=inet_addr("8.8.8.8")},
msg_iov(1)=[{"\266\344\1\0\0\1\0\0\0\0\0\0\6kernel\3org\0\0\1\0\1", 28}],
msg_controllen=0, msg_flags=0}, 0) = 28
epoll_wait(3, {}, 64, 0)                = 0
epoll_wait(3, {}, 64, 4999)             = 0

However, tcpdump shows a normal reply:

20:28:44.162897 IP 10.7.74.7.43167 > 8.8.8.8.domain: 46820+ A? kernel.org. (28)
20:28:44.221308 IP 8.8.8.8.domain > 10.7.74.7.43167: 46820 1/0/0 A 149.20.4.69 (44)
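
The payload in the sendmsg() call can be decoded by hand to confirm that it
is the same query tcpdump reports (ID 46820, recursion desired, A? kernel.org,
28 bytes). A minimal Python sketch of that cross-check follows; the function
name parse_dns_query is purely illustrative and is not part of any tool used
in this report, and it handles only a single uncompressed question:

```python
import struct


def parse_dns_query(payload: bytes) -> dict:
    """Decode the fixed header and the first question of a DNS query.

    Illustrative helper (hypothetical name); assumes an uncompressed
    question section, which is what a plain query contains.
    """
    # 12-byte header: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT
    ident, flags, qdcount, _an, _ns, _ar = struct.unpack("!6H", payload[:12])

    # QNAME: sequence of length-prefixed labels ending with a zero byte
    labels = []
    pos = 12
    while payload[pos] != 0:
        length = payload[pos]
        labels.append(payload[pos + 1:pos + 1 + length].decode("ascii"))
        pos += 1 + length
    pos += 1  # skip the terminating root label

    qtype, qclass = struct.unpack("!2H", payload[pos:pos + 4])
    return {
        "id": ident,
        "recursion_desired": bool(flags & 0x0100),  # the "+" in tcpdump
        "qdcount": qdcount,
        "name": ".".join(labels),
        "qtype": qtype,    # 1 = A
        "qclass": qclass,  # 1 = IN
        "length": pos + 4,
    }


# Payload copied from the strace excerpt (strace prints octal escapes).
PAYLOAD = b"\266\344\1\0\0\1\0\0\0\0\0\0\6kernel\3org\0\0\1\0\1"
```

Calling parse_dns_query(PAYLOAD) yields the ID, name and length that can be
compared against the tcpdump line for the outgoing query.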

Once this bug has occurred, it is no longer possible to perform DNS requests on
the affected system. I tried stopping and restarting all network-related
daemons and recreating network interfaces wherever possible (e.g. pppX
devices), but to no avail. I use iptables and ebtables on this host, but
resetting them (flushing all chains, removing user chains, setting all policies
to ACCEPT) doesn't help. The only working solution is to reboot the system.

This bug happens rarely and randomly (about once every 7-12 days on a
production system that runs 24x7), but I have hit it 5 times already. Due to
the rare and random nature of the bug I can't bisect it.
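
Because the failure is so rare, a small probe that repeats the same
send-then-epoll sequence could be run periodically to record when delivery
stops. The following is only a sketch under stated assumptions: Linux, Python 3,
UDP only, a fixed A/IN question, and hypothetical helper names (build_query,
send_query, wait_reply) that do not come from host or any other existing tool:

```python
import select
import socket
import struct


def build_query(name: str, ident: int) -> bytes:
    """Build a minimal DNS query for an A record (RD flag set)."""
    header = struct.pack("!6H", ident, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\0"
    return header + qname + struct.pack("!2H", 1, 1)  # QTYPE=A, QCLASS=IN


def send_query(server, name, ident=0x1234):
    """Send one query over UDP and return the non-blocking socket."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setblocking(False)
    sock.sendto(build_query(name, ident), server)
    return sock


def wait_reply(sock, timeout_s):
    """Wait with epoll for a datagram; return it, or None on timeout."""
    ep = select.epoll()
    try:
        ep.register(sock.fileno(), select.EPOLLIN)
        if not ep.poll(timeout_s):
            return None  # same symptom as epoll_wait() returning 0
        data, _addr = sock.recvfrom(4096)
        return data
    finally:
        ep.close()
```

Calling send_query(("8.8.8.8", 53), "kernel.org") followed by
wait_reply(sock, 5.0) mirrors the sequence seen in the strace excerpt; on an
affected system one would expect None even while tcpdump still shows the reply
arriving.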

This problem first occurred after I updated the vanilla kernel from 2.6.39.4 to
3.4.6. Afterwards I updated the kernel to 3.4.10 in the hope that this would
fix the problem, but with no result. (I updated because of commit
2ce42ec4ef551b08d2e5d26775d838ac640f82ad, which describes a somewhat similar
issue, though I don't use the I/OAT engine due to lack of hardware support.)

More details, trace files and kernel configs are attached to the bugzilla entry:
https://bugzilla.kernel.org/show_bug.cgi?id=48081

In a few days I'll try 3.4.12 (I need to rebuild the kernel anyway due to an
unrelated issue) and will report whether this bug occurs again. Please note
that it may take several weeks to check this.

Best regards,
Andrew Savchenko


