* Kernel panic in inet_twdr_do_twkill_work
@ 2009-05-14  1:22 Eric W. Biederman
  2009-05-14  7:53 ` Daniel Lezcano
  0 siblings, 1 reply; 10+ messages in thread
From: Eric W. Biederman @ 2009-05-14  1:22 UTC (permalink / raw)
  To: Daniel Lezcano; +Cc: netdev, Denis V. Lunev

So far I have only seen this twice.  But the backtrace looks
almost identical to the one in commit d315492b1a6ba29da0fa2860759505ae1b2db857

The kernels I saw this on were patched versions of 2.6.28 with some
network namespace backports.  commit
d315492b1a6ba29da0fa2860759505ae1b2db857 was definitely present.

Daniel, any ideas?

Eric

^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Kernel panic in inet_twdr_do_twkill_work
  2009-05-14  1:22 Kernel panic in inet_twdr_do_twkill_work Eric W. Biederman
@ 2009-05-14  7:53 ` Daniel Lezcano
  2009-05-14  8:18   ` Eric W. Biederman
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Lezcano @ 2009-05-14  7:53 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: netdev, Denis V. Lunev

Eric W. Biederman wrote:
> So far I have only seen this twice.  But the backtrace looks
> almost identical to the one in commit d315492b1a6ba29da0fa2860759505ae1b2db857
>
> The kernels I saw this on were patched versions of 2.6.28 with some
> network namespace backports.  commit
> d315492b1a6ba29da0fa2860759505ae1b2db857 was definitely present.
>
> Daniel, any ideas?
>

Hi Eric,

I found this one. Maybe it is related to your problem:

commit 2bad35b7c9588eb5e65c03bcae54e7eb6b1a6504

Let me know :)

^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Kernel panic in inet_twdr_do_twkill_work
  2009-05-14  7:53 ` Daniel Lezcano
@ 2009-05-14  8:18   ` Eric W. Biederman
  2009-05-14  8:33     ` Daniel Lezcano
  0 siblings, 1 reply; 10+ messages in thread
From: Eric W. Biederman @ 2009-05-14  8:18 UTC (permalink / raw)
  To: Daniel Lezcano; +Cc: netdev, Denis V. Lunev

Daniel Lezcano <daniel.lezcano@free.fr> writes:

> Eric W. Biederman wrote:
>> So far I have only seen this twice.  But the backtrace looks
>> almost identical to the one in commit d315492b1a6ba29da0fa2860759505ae1b2db857
>>
>> The kernels I saw this on were patched versions of 2.6.28 with some
>> network namespace backports.  commit
>> d315492b1a6ba29da0fa2860759505ae1b2db857 was definitely present.
>>
>> Daniel, any ideas?
>>
> Hi Eric,
>
> I found this one. Maybe it is related to your problem:
>
> commit 2bad35b7c9588eb5e65c03bcae54e7eb6b1a6504
>
> Let me know :)

"netns: oops in ip[6]_frag_reasm incrementing stats" does not look likely.

There is no real ipv6 traffic currently on our network and the panic
is definitely in inet_twdr_do_twkill_work.

Further, we are getting the net of a timewait socket.  So I don't see how
a problem with NULL devs could have anything to do with it.

I really suspect the purge code is not being successful.

Eric

^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Kernel panic in inet_twdr_do_twkill_work
  2009-05-14  8:18   ` Eric W. Biederman
@ 2009-05-14  8:33     ` Daniel Lezcano
  2009-05-14  9:13       ` Eric W. Biederman
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Lezcano @ 2009-05-14  8:33 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: netdev, Denis V. Lunev

Eric W. Biederman wrote:
> Daniel Lezcano <daniel.lezcano@free.fr> writes:
>
>> Eric W. Biederman wrote:
>>> So far I have only seen this twice.  But the backtrace looks
>>> almost identical to the one in commit d315492b1a6ba29da0fa2860759505ae1b2db857
>>>
>>> The kernels I saw this on were patched versions of 2.6.28 with some
>>> network namespace backports.  commit
>>> d315492b1a6ba29da0fa2860759505ae1b2db857 was definitely present.
>>>
>>> Daniel, any ideas?
>>>
>> Hi Eric,
>>
>> I found this one. Maybe it is related to your problem:
>>
>> commit 2bad35b7c9588eb5e65c03bcae54e7eb6b1a6504
>>
>> Let me know :)
>>
>
> "netns: oops in ip[6]_frag_reasm incrementing stats" does not look likely.
>
> There is no real ipv6 traffic currently on our network and the panic
> is definitely in inet_twdr_do_twkill_work.
>
> Further, we are getting the net of a timewait socket.  So I don't see how
> a problem with NULL devs could have anything to do with it.
>
> I really suspect the purge code is not being successful.
>

Maybe you can enable NETNS_REFCNT_DEBUG to check whether the timewait
sockets were destroyed when the namespace was destroyed? Unfortunately,
it looks like the option is not in Kconfig :(

^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Kernel panic in inet_twdr_do_twkill_work
  2009-05-14  8:33     ` Daniel Lezcano
@ 2009-05-14  9:13       ` Eric W. Biederman
  2009-05-14  9:21         ` Daniel Lezcano
                           ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Eric W. Biederman @ 2009-05-14  9:13 UTC (permalink / raw)
  To: Daniel Lezcano; +Cc: netdev, Denis V. Lunev

Daniel Lezcano <daniel.lezcano@free.fr> writes:

> Maybe you can enable NETNS_REFCNT_DEBUG to check whether the timewait
> sockets were destroyed when the namespace was destroyed? Unfortunately,
> it looks like the option is not in Kconfig :(

Looks like a good starting place.

I will enable that when I respin my internal kernel.

I don't have a good reproducer at the moment, so I was hoping we could
figure this out with code inspection.

Eric

^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Kernel panic in inet_twdr_do_twkill_work
  2009-05-14  9:13       ` Eric W. Biederman
@ 2009-05-14  9:21         ` Daniel Lezcano
  2009-05-14  9:42         ` Daniel Lezcano
  2009-05-24 13:26         ` Daniel Lezcano
  2 siblings, 0 replies; 10+ messages in thread
From: Daniel Lezcano @ 2009-05-14  9:21 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: netdev, Denis V. Lunev

Eric W. Biederman wrote:
> Daniel Lezcano <daniel.lezcano@free.fr> writes:
>
>> Maybe you can enable NETNS_REFCNT_DEBUG to check whether the timewait
>> sockets were destroyed when the namespace was destroyed? Unfortunately,
>> it looks like the option is not in Kconfig :(
>>
>
> Looks like a good starting place.
>
> I will enable that when I respin my internal kernel.
>
> I don't have a good reproducer at the moment, so I was hoping we could
> figure this out with code inspection.
>

I remember I wrote a small program that creates hundreds of timewait
sockets to test the purge. I will see if I can find it and try it as a
reproducer.

^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Kernel panic in inet_twdr_do_twkill_work
  2009-05-14  9:13       ` Eric W. Biederman
  2009-05-14  9:21         ` Daniel Lezcano
@ 2009-05-14  9:42         ` Daniel Lezcano
  2009-05-24 13:26         ` Daniel Lezcano
  2 siblings, 0 replies; 10+ messages in thread
From: Daniel Lezcano @ 2009-05-14  9:42 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: netdev, Denis V. Lunev

[-- Attachment #1: Type: text/plain, Size: 690 bytes --]

Eric W. Biederman wrote:
> Daniel Lezcano <daniel.lezcano@free.fr> writes:
>
>> Maybe you can enable NETNS_REFCNT_DEBUG to check whether the timewait
>> sockets were destroyed when the namespace was destroyed? Unfortunately,
>> it looks like the option is not in Kconfig :(
>>
>
> Looks like a good starting place.
>
> I will enable that when I respin my internal kernel.
>
> I don't have a good reproducer at the moment, so I was hoping we could
> figure this out with code inspection.
>

I found this one, which creates a lot of timewait sockets. I tried it on
a 2.6.29 kernel and was not able to reproduce the panic.

Can you check whether this program reproduces the bug?

[-- Attachment #2: timewait.c --]
[-- Type: text/x-csrc, Size: 2286 bytes --]

#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/poll.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Both processes keep MAXCONN sockets open, so the open-file limit
 * (ulimit -n) must be above this value. */
#define MAXCONN 10000

int client(int *fds)
{
	int i, len;
	struct sockaddr_in6 addr;

	close(fds[1]);

	memset(&addr, 0, sizeof(addr));
	addr.sin6_family = AF_INET6;
	addr.sin6_port = htons(10000);
	addr.sin6_addr = in6addr_loopback;

	/* Block until the server signals that it is listening. */
	if (read(fds[0], &i, sizeof(i)) == -1) {
		perror("read");
		return 1;
	}

	for (i = 0; i < MAXCONN; i++) {
		int fd = socket(PF_INET6, SOCK_STREAM, 0);
		if (fd == -1) {
			perror("socket");
			return 1;
		}

		if (connect(fd, (const struct sockaddr *)&addr, sizeof(addr))) {
			perror("connect");
			return 1;
		}

		len = write(fd, &fd, sizeof(fd));
		if (!len) {
			fprintf(stderr, "write wrote 0 bytes\n");
			return 1;
		}

		if (len == -1) {
			perror("write");
			return 1;
		}
	}

	/* The sockets are left open on purpose and are closed at exit. */
	return 0;
}

int server(int *fds)
{
	int i = 0, fd, on = 1, fdpoll[MAXCONN];
	struct sockaddr_in6 addr;
	socklen_t socklen = sizeof(addr);

	close(fds[0]);

	fd = socket(PF_INET6, SOCK_STREAM, 0);
	if (fd == -1) {
		perror("socket");
		return 1;
	}

	memset(&addr, 0, sizeof(addr));
	addr.sin6_family = AF_INET6;
	addr.sin6_port = htons(10000);
	addr.sin6_addr = in6addr_loopback;

	if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on))) {
		perror("setsockopt");
		return 1;
	}

	if (bind(fd, (const struct sockaddr *)&addr, sizeof(addr))) {
		perror("bind");
		return 1;
	}

	if (listen(fd, MAXCONN)) {
		perror("listen");
		return 1;
	}

	/* Tell the client that the listening socket is ready. */
	if (write(fds[1], &i, sizeof(i)) == -1) {
		perror("write");
		return 1;
	}

	for (i = 0; i < MAXCONN; i++) {
		int len, f = accept(fd, (struct sockaddr *)&addr, &socklen);
		if (f == -1) {
			perror("accept");
			return 1;
		}

		/* Keep the accepted socket open until the process exits. */
		fdpoll[i] = f;

		len = read(f, &f, sizeof(f));
		if (!len) {
			fprintf(stderr, "read returned 0 bytes\n");
			return 1;
		}

		if (len == -1) {
			perror("read");
			return 1;
		}
	}

	return 0;
}

int main(int argc, char *argv[])
{
	int fds[2];
	int pid;

	if (pipe(fds)) {
		perror("pipe");
		return 1;
	}

	pid = fork();
	if (pid == -1) {
		perror("fork");
		return 1;
	}

	if (!pid)
		return client(fds);
	else
		return server(fds);
}

^ permalink raw reply [flat|nested] 10+ messages in thread
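[Editor's sketch, not part of the original thread.] Before relying on timewait.c as a reproducer, it may help to confirm that a run really leaves sockets in TIME_WAIT. The helper below is a minimal sketch that counts TIME_WAIT rows in /proc/net/tcp6 (or /proc/net/tcp), assuming the usual row layout where the fourth field is the hexadecimal socket state and 06 means TIME_WAIT. The names is_time_wait_row and count_time_wait are hypothetical and were not posted by anyone in this thread.

```c
#include <stdio.h>

/* Assumption: state value 06 in the "st" column means TIME_WAIT. */
#define TCP_STATE_TIME_WAIT 0x06

/* Hypothetical helper: returns 1 if one row of /proc/net/tcp or
 * /proc/net/tcp6 describes a socket in TIME_WAIT, 0 otherwise.
 * The column header row does not start with a number, so it yields 0. */
int is_time_wait_row(const char *row)
{
	unsigned int state;

	/* Row layout: "sl: local_addr:port remote_addr:port st ..." */
	if (sscanf(row, " %*d: %*[^:]:%*x %*[^:]:%*x %x", &state) != 1)
		return 0;

	return state == TCP_STATE_TIME_WAIT;
}

/* Hypothetical helper: counts TIME_WAIT rows in the given proc file.
 * Returns -1 if the file cannot be opened. */
long count_time_wait(const char *path)
{
	char row[512];
	long count = 0;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;

	while (fgets(row, sizeof(row), f))
		count += is_time_wait_row(row);

	fclose(f);
	return count;
}
```

Calling count_time_wait("/proc/net/tcp6") shortly after timewait.c exits should give a rough count. The file only lists sockets of the network namespace of the process reading it, so the check has to run inside the namespace under test.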
* Re: Kernel panic in inet_twdr_do_twkill_work
  2009-05-14  9:13       ` Eric W. Biederman
  2009-05-14  9:21         ` Daniel Lezcano
  2009-05-14  9:42         ` Daniel Lezcano
@ 2009-05-24 13:26         ` Daniel Lezcano
  2009-05-24 13:54           ` Eric W. Biederman
  2009-06-03  0:40           ` Eric W. Biederman
  2 siblings, 2 replies; 10+ messages in thread
From: Daniel Lezcano @ 2009-05-24 13:26 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: netdev, Denis V. Lunev

Eric W. Biederman wrote:
> Daniel Lezcano <daniel.lezcano@free.fr> writes:
>
>> Maybe you can enable NETNS_REFCNT_DEBUG to check whether the timewait
>> sockets were destroyed when the namespace was destroyed? Unfortunately,
>> it looks like the option is not in Kconfig :(
>>
>
> Looks like a good starting place.
>
> I will enable that when I respin my internal kernel.
>
> I don't have a good reproducer at the moment, so I was hoping we could
> figure this out with code inspection.
>

Hi Eric,

did you succeed in reproducing the bug with the test program I sent you?

^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Kernel panic in inet_twdr_do_twkill_work
  2009-05-24 13:26         ` Daniel Lezcano
@ 2009-05-24 13:54           ` Eric W. Biederman
  2009-06-03  0:40           ` Eric W. Biederman
  1 sibling, 0 replies; 10+ messages in thread
From: Eric W. Biederman @ 2009-05-24 13:54 UTC (permalink / raw)
  To: Daniel Lezcano; +Cc: netdev, Denis V. Lunev

Daniel Lezcano <daniel.lezcano@free.fr> writes:

> did you succeed in reproducing the bug with the test program I sent you?

Grr.  My apologies.  I haven't had a chance to play with that yet.
Thank you for the reminder.

Eric

^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Kernel panic in inet_twdr_do_twkill_work
  2009-05-24 13:26         ` Daniel Lezcano
  2009-05-24 13:54           ` Eric W. Biederman
@ 2009-06-03  0:40           ` Eric W. Biederman
  1 sibling, 0 replies; 10+ messages in thread
From: Eric W. Biederman @ 2009-06-03  0:40 UTC (permalink / raw)
  To: Daniel Lezcano; +Cc: netdev, Denis V. Lunev

Daniel Lezcano <daniel.lezcano@free.fr> writes:

> Eric W. Biederman wrote:
>> Daniel Lezcano <daniel.lezcano@free.fr> writes:
>>
>>> Maybe you can enable NETNS_REFCNT_DEBUG to check whether the timewait
>>> sockets were destroyed when the namespace was destroyed? Unfortunately,
>>> it looks like the option is not in Kconfig :(
>>>
>>
>> Looks like a good starting place.
>>
>> I will enable that when I respin my internal kernel.
>>
>> I don't have a good reproducer at the moment, so I was hoping we could
>> figure this out with code inspection.
>>
> Hi Eric,
>
> did you succeed in reproducing the bug with the test program I sent you?

Weird.  I finally got around to running your little test app, and it
does not trigger the panic here.

At the same time, I am starting to see what I think is this error more often.

Eric

^ permalink raw reply [flat|nested] 10+ messages in thread