From: Holger Eitzenberger <holger@eitzenberger.org>
To: netfilter-devel <netfilter-devel@lists.netfilter.org>
Cc: netdev@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>
Subject: ctnetlink loop
Date: Fri, 3 Dec 2010 14:39:03 +0100
Message-ID: <20101203133903.GG13225@mail.eitzenberger.org>

[-- Attachment #1: Type: text/plain, Size: 1349 bytes --]

Hi,

I see a problem with how ctnetlink GET requests are being
processed in the kernel (2.6.32.24) under high load.

Initially I saw this problem on a large performance testing
system when getting HTTP proxy performance numbers, but lately
there have been two reports from large customer boxes (both
many-core with 10G NICs).

The symptom is Netlink looping in nfnetlink_rcv_msg(), which
happens because netlink_unicast() returns -EAGAIN when
ctnetlink_get_conntrack() tries to write the newly created
Netlink skb to the socket receive buffer.  In this case a
(possibly) infinite loop is entered.  It is most likely infinite
when the userland party that should receive those messages is
single-threaded and stuck in the sendmsg() call, and therefore
unable to read anything.

I tried to reproduce this several times.  A few times the loop
disappeared and the box proceeded normally after some time.
I have no explanation for this.

The attached patch tries to solve it by simply not retrying
netlink_unicast() for the reply skb and failing with -ENOBUFS
instead.  The reasoning is that once a Netlink overrun has been
observed, it seems counterintuitive to insist on sending
one more Netlink message.
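
To illustrate the difference, here is a minimal userspace sketch in
C.  It is a model under stated assumptions, not the kernel code: all
model_* names are hypothetical, the receive buffer is reduced to a
counter that userland never drains, and the replay variant takes a
max_replays cap only so that the model terminates (the kernel loop
has no such cap).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the receiving socket: a fixed-size receive
 * buffer that userland never drains in this model. */
struct model_sock {
	int rcvbuf_used;
	int rcvbuf_size;
};

/* Stand-in for a GET handler such as ctnetlink_get_conntrack(): returns
 * -EAGAIN when the reply does not fit, otherwise queues one reply. */
static int model_get_handler(struct model_sock *sk)
{
	if (sk->rcvbuf_used >= sk->rcvbuf_size)
		return -EAGAIN;
	sk->rcvbuf_used++;
	return 0;
}

/* Behaviour before the patch: replay the request on -EAGAIN.
 * max_replays is an assumption of this model so that it terminates;
 * *replays receives the number of replays that were attempted. */
static int model_rcv_msg_replay(struct model_sock *sk, int max_replays,
				int *replays)
{
	int err;

	assert(sk != NULL && replays != NULL);
	*replays = 0;
	for (;;) {
		err = model_get_handler(sk);
		if (err != -EAGAIN || *replays >= max_replays)
			return err;
		(*replays)++;
	}
}

/* Behaviour with the patch: no replay, report the overrun as -ENOBUFS. */
static int model_rcv_msg_patched(struct model_sock *sk)
{
	int err;

	assert(sk != NULL);
	err = model_get_handler(sk);
	if (err == -EAGAIN)
		err = -ENOBUFS;
	return err;
}
```

With a full buffer, model_rcv_msg_replay() uses up every replay it
is allowed and still ends with -EAGAIN, while model_rcv_msg_patched()
returns -ENOBUFS after a single attempt.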

I checked for possible side effects on other Netlink requests,
but please double-check.

The patch applies to net-next-2.6.

Feedback appreciated.

 /holger


[-- Attachment #2: nfnl-fix.diff --]
[-- Type: text/x-diff, Size: 1696 bytes --]

nfnetlink: avoid unbounded loop on busy Netlink socket

I see a problem with how ctnetlink GET requests are being
processed in the kernel (2.6.32.24) under high load.

The symptom is Netlink looping in nfnetlink_rcv_msg(), which
happens because netlink_unicast() returns -EAGAIN when
ctnetlink_get_conntrack() tries to write the newly created
Netlink skb to the socket receive buffer.  In this case a
(possibly) infinite loop is entered.  I think it is most likely
infinite when the userland party that should receive those
messages is single-threaded and stuck in the sendmsg() call, and
therefore unable to read anything.

I tried to reproduce this several times.  A few times the loop
disappeared and the box proceeded normally after some minutes.
I have no explanation for this.

This patch tries to solve it by simply not retrying
netlink_unicast() for the reply skb and failing with -ENOBUFS
instead.  The reasoning is that once a Netlink overrun has been
detected, it seems counterintuitive to insist on sending one more
Netlink message.

Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>

Index: net-next-2.6/net/netfilter/nfnetlink.c
===================================================================
--- net-next-2.6.orig/net/netfilter/nfnetlink.c	2010-12-03 14:33:32.000000000 +0100
+++ net-next-2.6/net/netfilter/nfnetlink.c	2010-12-03 14:34:21.000000000 +0100
@@ -138,7 +138,6 @@
 		return 0;
 
 	type = nlh->nlmsg_type;
-replay:
 	ss = nfnetlink_get_subsys(type);
 	if (!ss) {
 #ifdef CONFIG_MODULES
@@ -169,7 +168,7 @@
 
 		err = nc->call(net->nfnl, skb, nlh, (const struct nlattr **)cda);
 		if (err == -EAGAIN)
-			goto replay;
+			err = -ENOBUFS;
 		return err;
 	}
 }
