Re: scheduling while atomic followed by oops upon conntrackd -c execution

From: Pablo Neira Ayuso <pablo@netfilter.org>
To: Kerin Millar <kerframil@gmail.com>
Cc: netfilter-devel@vger.kernel.org
Subject: Re: scheduling while atomic followed by oops upon conntrackd -c execution
Date: Tue, 6 Mar 2012 12:14:27 +0100
Message-ID: <20120306111427.GA448@1984>
In-Reply-To: <jj2sjo$i8a$1@dough.gmane.org>

[-- Attachment #1: Type: text/plain, Size: 5468 bytes --]

Hi Kerin,

On Mon, Mar 05, 2012 at 05:19:49PM +0000, Kerin Millar wrote:
> Hi Pablo,
> 
> On 04/03/2012 11:01, Pablo Neira Ayuso wrote:
> >Hi Kerin,
> >
> >On Sat, Mar 03, 2012 at 06:47:27PM +0000, Kerin Millar wrote:
> >>Hi,
> >>
> >>On 03/03/2012 13:30, Pablo Neira Ayuso wrote:
> >>>I just posted another patch to the ML that is a relative fix to
> >>>Jozsef's patch. You have to apply that as well.
> >>
> >>I've now tested 3.3-rc5 with the addition of the above mentioned
> >>follow-on patch. The behaviour during conntrackd -c execution is
> >>clearly much improved - in so far as it doesn't generate much noise
> >>- but the crash that follows remains. Here's a netconsole capture:-
> >>
> >>http://paste.pocoo.org/raw/560439/
> >
> >Great to know :-).
> 
> I apologize but I think I may have led you astray on the nf_nat
> issue. At the time of submitting my original report, I now believe
> that the nf_nat module wasn't loaded prior to starting conntrackd,
> although it was definitely available. For all tests that followed,
> however, I am entirely certain that the nf_nat module was loaded in
> advance. The upshot is that my claim that things had improved may
> have been premature; I need to specifically test under both
> circumstances to be sure that things are improving. That is, both
> with and without the module loaded in advance.
> 
> Following my own advice then, I first tried going through my test
> case *without* loading nf_nat in advance. Alas, conntrackd -c
> triggered hard lockups and didn't return to prompt. Here are the
> results:-
> 
> http://paste.pocoo.org/raw/561350/
> 
> In case it matters, the existing ssh session continued to respond to
> input but I was no longer able to initiate any new sessions.
> 
> >
> >Regarding your previous email: I'm sorry, on reading it I
> >thought you were using 2.6.32, which was not the case. Your
> >configuration is perfectly reasonable.
> >
> >It seems we still have problems regarding early_drop, but this time
> >with reliable event delivery enabled (15 seconds is the time that
> >is required to retry sending the destroy event).
> >
> >If you can test the following patch, I'd appreciate it.
> 
> Gladly. I applied the patch to my 3.3-rc5 tree, which is still
> carrying the two patches discussed earlier in the thread. I then
> went through my test case under normal circumstances i.e. all
> firewall rules in place, nf_nat confirmed present before conntrackd
> etc. Again, conntrackd -c did not return to prompt. Here are the
> results:-
> 
> http://paste.pocoo.org/raw/561354/
>
> Well, at least there was no oops this time. I should also add that
> the patch was present for both of the tests mentioned in this email.

The previous patch that I sent you was not OK, sorry. I have committed
the following to my git tree:

http://1984.lsi.us.es/git/net/commit/?id=691d47b2dc8fdb8fea5a2b59c46e70363fa66897

I've been using the following tools, which you can find attached to
this email. They are much simpler than conntrackd, but in essence they
do the same thing:

* conntrack_stress.c
* conntrack_events.c

gcc conntrack_stress.c -o ct_stress -lnetfilter_conntrack
gcc conntrack_events.c -o ct_events -lnetfilter_conntrack

Then, to listen to events with reliable event delivery enabled:

# ./ct_events &

And to create loads of flow entries in ASSURED state:

# ./ct_stress 65535 # that's the ct table size on my laptop

You'll hit ENOMEM errors at some point; that's fine. I see no oops or
lockups here.
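
If you are unsure which table size to pass to ct_stress, the helper
below is a minimal sketch of how it could be read programmatically. It
assumes the limit is exposed at the usual sysctl path,
/proc/sys/net/netfilter/nf_conntrack_max; read_ct_limit() is a
hypothetical name and not part of libnetfilter_conntrack.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read a single non-negative integer from a sysctl-style file.
 * Returns the value, or -1 if the file cannot be opened or parsed.
 * (Hypothetical helper, for illustration only.)
 */
long read_ct_limit(const char *path)
{
	FILE *f;
	long val;

	assert(path != NULL);

	f = fopen(path, "r");
	if (!f)
		return -1;

	/* Expect exactly one integer; treat anything else as a failure. */
	if (fscanf(f, "%ld", &val) != 1 || val < 0)
		val = -1;

	fclose(f);
	return val;
}
```

Calling read_ct_limit("/proc/sys/net/netfilter/nf_conntrack_max")
should then give the number to pass as the ct_stress argument, or -1
if the conntrack module is not loaded.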

I have pushed these tools to the qa/ directory under
libnetfilter_conntrack:

commit 94e75add9867fb6f0e05e73b23f723f139da829e
Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date:   Tue Mar 6 12:10:55 2012 +0100

    qa: add some stress tools to test conntrack via ctnetlink

(BTW, ct_stress may disrupt your network connection since the table
gets filled. You can use conntrack -F to empty the ct table again.)

> ---
> Incidentally, I found out why the internal cache on the master was
> filling up to capacity. It was apparently due to the use of
> "iptables -I PREROUTING -t raw -j CT --ctevents assured". Perhaps
> I'm missing something but doesn't this stop events such as new and
> destroy from being propagated? An inspection with conntrack -E
> suggests so. Once I removed the above rule, I could see destroy
> events being propagated and the number of active connections in the
> cache no longer exceeded my chosen limit of 2097152 ...

Yes, that line was wrong. I have fixed it in the documentation; the
correct rule is:

iptables -I PREROUTING -t raw -j CT --ctevents assured,destroy

With this rule, destroy events are also delivered to user-space.
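
To illustrate why the first rule hid destroy events: --ctevents limits
delivery to the listed event types, which can be modelled as a bitmask
test. The following is a minimal sketch under that assumption, not the
actual xt_CT code. The names ctevents_mask(), event_passes() and the
EV_* bit positions are hypothetical; the positions are assumed to
follow the order of the kernel's enum ip_conntrack_events.

```c
#include <assert.h>
#include <string.h>

/* Illustrative bit positions (assumption: same order as the kernel enum). */
enum { EV_NEW, EV_RELATED, EV_DESTROY, EV_REPLY, EV_ASSURED };

static const struct {
	const char *name;
	int bit;
} ev_names[] = {
	{ "new", EV_NEW },         { "related", EV_RELATED },
	{ "destroy", EV_DESTROY }, { "reply", EV_REPLY },
	{ "assured", EV_ASSURED },
};

/* Turn a comma-separated list such as "assured,destroy" into a bitmask.
 * Unknown names are silently ignored in this sketch. */
unsigned int ctevents_mask(const char *list)
{
	unsigned int mask = 0;
	size_t i, len;

	assert(list != NULL);

	while (*list) {
		len = strcspn(list, ",");	/* length of the current token */
		for (i = 0; i < sizeof(ev_names) / sizeof(ev_names[0]); i++) {
			if (strlen(ev_names[i].name) == len &&
			    strncmp(list, ev_names[i].name, len) == 0)
				mask |= 1u << ev_names[i].bit;
		}
		list += len;
		if (*list == ',')
			list++;
	}
	return mask;
}

/* An event is delivered only if its bit is set in the mask. */
int event_passes(unsigned int mask, int bit)
{
	return (mask >> bit) & 1u;
}
```

With the mask built from "assured", event_passes(mask, EV_DESTROY) is
0, so user-space never learns that a flow has gone away; with
"assured,destroy" it is 1.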

> # conntrack -S | head -n1; conntrackd -s | head -n2
> entries                 725826
> cache internal:
> current active connections:          1409472
> 
> Whatever the case, I'm quite happy to go without this rule as these
> systems are coping fine with the load incurred by conntrackd.

I want to get things fixed, so please don't give up on using that rule
yet :-).

Regarding the hard lockups: I'd be grateful if you could redo the
tests, both with conntrackd and with the tools that I sent you.

Make sure you have these three patches applied; note that the last one
has changed.

http://1984.lsi.us.es/git/net/commit/?id=7d367e06688dc7a2cc98c2ace04e1296e1d987e2
http://1984.lsi.us.es/git/net/commit/?id=a8f341e98a46f579061fabfe6ea50be3d0eb2c60
http://1984.lsi.us.es/git/net/commit/?id=691d47b2dc8fdb8fea5a2b59c46e70363fa66897

Thanks!

[-- Attachment #2: conntrack_events.c --]
[-- Type: text/x-csrc, Size: 1153 bytes --]

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>

#include <libnetfilter_conntrack/libnetfilter_conntrack.h>

static int event_cb(enum nf_conntrack_msg_type type,
		    struct nf_conntrack *ct,
		    void *data)
{
	static int i = 0;
	static int new, destroy;

	if (type == NFCT_T_NEW)
		new++;
	else if (type == NFCT_T_DESTROY)
		destroy++;

	if ((++i % 10000) == 0)
		printf("%d events received (%d new, %d destroy)\n",
			i, new, destroy);

	return NFCT_CB_CONTINUE;
}

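int main(void)
{
	int ret;
	struct nfct_handle *h;
	int on = 1;

	h = nfct_open(CONNTRACK, NFCT_ALL_CT_GROUPS);
	if (!h) {
		perror("nfct_open");
		return EXIT_FAILURE;
	}

	/* Socket options used for reliable event delivery: failed
	 * deliveries to this socket are treated as errors by the sender,
	 * and ENOBUFS is not reported to us on overrun. */
	if (setsockopt(nfct_fd(h), SOL_NETLINK,
		       NETLINK_BROADCAST_SEND_ERROR, &on, sizeof(on)) == -1)
		perror("setsockopt(NETLINK_BROADCAST_SEND_ERROR)");
	if (setsockopt(nfct_fd(h), SOL_NETLINK,
		       NETLINK_NO_ENOBUFS, &on, sizeof(on)) == -1)
		perror("setsockopt(NETLINK_NO_ENOBUFS)");

	nfct_callback_register(h, NFCT_T_ALL, event_cb, NULL);

	printf("TEST: waiting for events...\n");

	/* Blocks until an error occurs or the callback stops the loop. */
	ret = nfct_catch(h);

	printf("TEST: conntrack events ");
	if (ret == -1)
		printf("(%d)(%s)\n", ret, strerror(errno));
	else
		printf("(OK)\n");

	nfct_close(h);

	return ret == -1 ? EXIT_FAILURE : EXIT_SUCCESS;
}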

[-- Attachment #3: conntrack_stress.c --]
[-- Type: text/x-csrc, Size: 1487 bytes --]

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <time.h>
#include <arpa/inet.h>

#include <libnetfilter_conntrack/libnetfilter_conntrack.h>
#include <libnetfilter_conntrack/libnetfilter_conntrack_tcp.h>

int main(int argc, char *argv[])
{
	int ret = 0;	/* stays 0 if the loop below never runs */
	unsigned int i, n, r;
	struct nfct_handle *h;
	struct nf_conntrack *ct;

	if (argc < 2) {
		fprintf(stderr, "Usage: %s [ct_table_size]\n", argv[0]);
		exit(EXIT_FAILURE);
	}

	/* Random offset so that repeated runs use different addresses. */
	srandom(time(NULL));
	r = random();

	ct = nfct_new();
	if (!ct) {
		perror("nfct_new");
		return EXIT_FAILURE;
	}

	h = nfct_open(CONNTRACK, 0);
	if (!h) {
		perror("nfct_open");
		nfct_destroy(ct);
		return EXIT_FAILURE;
	}

	/* Try to add twice as many entries as the table can hold. */
	n = (unsigned int)atoi(argv[1]) * 2;

	for (i = 0; i < n; i++) {
		/* Unsigned arithmetic: r + i may wrap around, which is fine. */
		nfct_set_attr_u8(ct, ATTR_L3PROTO, AF_INET);
		nfct_set_attr_u32(ct, ATTR_IPV4_SRC, inet_addr("1.1.1.1") + r + i);
		nfct_set_attr_u32(ct, ATTR_IPV4_DST, inet_addr("2.2.2.2") + r + i);

		nfct_set_attr_u8(ct, ATTR_L4PROTO, IPPROTO_TCP);
		nfct_set_attr_u16(ct, ATTR_PORT_SRC, htons(10));
		nfct_set_attr_u16(ct, ATTR_PORT_DST, htons(20));

		nfct_setobjopt(ct, NFCT_SOPT_SETUP_REPLY);

		nfct_set_attr_u8(ct, ATTR_TCP_STATE, TCP_CONNTRACK_ESTABLISHED);
		nfct_set_attr_u32(ct, ATTR_TIMEOUT, 1000);
		nfct_set_attr_u32(ct, ATTR_STATUS, IPS_ASSURED);

		/* Progress counts attempts, starting from zero. */
		if (i % 10000 == 0)
			printf("added %u flow entries\n", i);

		ret = nfct_query(h, NFCT_Q_CREATE, ct);
		if (ret == -1)
			perror("nfct_query");
	}
	nfct_close(h);

	nfct_destroy(ct);

	/* As before, the exit status reflects the last query only. */
	return ret == -1 ? EXIT_FAILURE : EXIT_SUCCESS;
}

