public inbox for linux-ia64@vger.kernel.org
* epoll oops on 2.6.x
@ 2004-05-13 21:39 Louay Gammo
  2004-05-14  1:58 ` Fw: " Davide Libenzi
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Louay Gammo @ 2004-05-13 21:39 UTC (permalink / raw)
  To: linux-ia64

[-- Attachment #1: Type: text/plain, Size: 412 bytes --]

Hi,

I keep getting these oopses and kernel panics when I overload my
Itanium-2 box with HTTP requests. I have seen them on 2.6.1 through
2.6.5, and they happen consistently when using epoll, but not select.
They also happen only on IA-64.

I am including some of these oopses as attachments. I am not sure what
the 'normal' bug reporting mechanism is for IA-64 kernels.

Thanks

Louay Gammo


[-- Attachment #2: epolloops.tar.gz --]
[-- Type: application/x-gzip, Size: 18809 bytes --]


* Re: Fw: epoll oops on 2.6.x
  2004-05-13 21:39 epoll oops on 2.6.x Louay Gammo
@ 2004-05-14  1:58 ` Davide Libenzi
  2004-05-14 22:23 ` Louay Gammo
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Davide Libenzi @ 2004-05-14  1:58 UTC (permalink / raw)
  To: linux-ia64

On Thu, 13 May 2004, Randy.Dunlap wrote:

> 
> Davide,
> >From the linux-ia64 mailing list, in case you didn't see it.....
> 
> Begin forwarded message:
> 
> Date: Thu, 13 May 2004 17:39:02 -0400
> From: Louay Gammo <lgammo@cs.uwaterloo.ca>
> To: linux-ia64@vger.kernel.org
> Subject: epoll oops on 2.6.x
> 
> 
> Hi,
> 
> I keep getting these oopses and kernel panics when I overload my 
> Itanium-2 box with http requests. I had these
> oopses in 2.6.1 through 2.6.5 and it is consistently happening while 
> using epoll, but not select. Also, these oopses
> happen only on IA-64.
> 
> I am including some of these oopses as attachments. I am not sure what 
> the 'normal' bug reporting mechanism is
> for IA-64 kernels.

I got this from Randy, since I was not subscribed to linux-ia64. Now 
I am. I took a quick look at it and so far I have no real clue. 
Only one of the oopses actually happens inside epoll. Most of the others 
come from __wake_up_common() (and some from vfs_read(), which is 
quite unrelated), which makes me think it might be epoll related (epoll 
drops a wait queue item in). Can you try a few experiments?

1) Disable pre-emption (if enabled)
2) Load the server thru loopback
3) #define DEBUG_EPOLL 10

(Judging by the INIT messages, your machine does not look to be in great 
shape, though)



- Davide



* Re: Fw: epoll oops on 2.6.x
  2004-05-13 21:39 epoll oops on 2.6.x Louay Gammo
  2004-05-14  1:58 ` Fw: " Davide Libenzi
@ 2004-05-14 22:23 ` Louay Gammo
  2004-05-15  0:49 ` Davide Libenzi
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Louay Gammo @ 2004-05-14 22:23 UTC (permalink / raw)
  To: linux-ia64

[-- Attachment #1: Type: text/plain, Size: 4287 bytes --]

Davide Libenzi wrote:

>> including some of these oopses as attachments. I am not sure what 
>>the 'normal' bug reporting mechanism is
>>for IA-64 kernels.
>>    
>>
>
>I'm getting this from Randy, since I was not subscribed to linux-ia64. Now 
>I am. I took a quick look at it and so far the clue is completely missing. 
>Only one of the oops actually happen inside epoll, even if the other ones 
>comes typically from __wake_up_common() (others from vfs_read(), that is 
>quite unrelated), that let me think that it might be epoll related (epoll 
>has a wait queue item dropped in). Can you try a few experiments?
>
>1) Disable pre-emption (if enabled)
>2) Load the server thru loopback
>3) #define DEBUG_EPOLL 10
>  
>

I did #1 and #2 and ran my original experiment. I did not get a panic but
I got 0 replies from the server. So I undid the #2 and I got a kernel 
panic (included below).
So I am going to try to overload the server on loopback interface (I think
this is what you are saying) and I will let you know of the results.

I am enclosing the kernel config just in case that might clue anyone to 
a configuration problem.

Thanks,

Louay Gammo

tissimo:~# Unable to handle kernel paging request at virtual address 000000000001003e
userver[1284]: Oops 8804682956800 [1]

Pid: 1284, CPU 0, comm:              userver
psr : 0000101008022018 ifs : 8000000000000003 ip  : [<a000000100080620>]    Not tainted
ip is at prepare_to_wait_exclusive+0x80/0xe0
unat: 0000000000000000 pfs : 0000000000000288 rsc : 0000000000000003
rnat: 0000000000000000 bsps: 0000000000000000 pr  : 0000000000158659
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c0270033f
csd : 0000000000000000 ssd : 0000000000000000
b0  : a000000100530360 b6  : a000000100002d70 b7  : a000000100581980
f6  : 000000000000000000000 f7  : 000000000000000000000
f8  : 000000000000000000000 f9  : 000000000000000000000
f10 : 000000000000000000000 f11 : 000000000000000000000
r1  : a000000100990000 r2  : e0000040f5f5fd58 r3  : 0000000000000001
r8  : 0000000000000000 r9  : 000000000001003e r10 : e0000040f5f5fd88
r11 : a000000100722900 r12 : e0000040f5f5fd30 r13 : e0000040f5f58000
r14 : e000004043d26160 r15 : e0000040f5f5fd58 r16 : e0000040f5f58000
r17 : e000004043880d48 r18 : 0000001008026018 r19 : 0000000000000000
r20 : 000000003fffff00 r21 : e0000040f5f5fd58 r22 : ffffffffffff0020
r23 : a0000001008e9ab0 r24 : e0000040f5f5fd78 r25 : a0000001008ea198
r26 : e0000040f5f5fd70 r27 : e000004042206088 r28 : e000004042206090
r29 : e0000040422060a8 r30 : e0000040422060b0 r31 : e0000040422060b8
                                                                    
Call Trace:
 [<a000000100017ae0>] show_stack+0x80/0xa0
                                sp=e0000040f5f5f900 bsp=e0000040f5f59378
 [<a00000010003a330>] die+0x130/0x1a0
                                sp=e0000040f5f5fad0 bsp=e0000040f5f59340
 [<a000000100054cd0>] ia64_do_page_fault+0x350/0x920
                                sp=e0000040f5f5fad0 bsp=e0000040f5f592d8
 [<a0000001000118a0>] ia64_leave_kernel+0x0/0x260
                                sp=e0000040f5f5fb60 bsp=e0000040f5f592d8
 [<a000000100080620>] prepare_to_wait_exclusive+0x80/0xe0
                                sp=e0000040f5f5fd30 bsp=e0000040f5f592c0
 [<a000000100530360>] __lock_sock+0x100/0x1a0
                                sp=e0000040f5f5fd30 bsp=e0000040f5f59298
 [<a000000100530f70>] lock_sock+0x50/0xa0
                                sp=e0000040f5f5fd90 bsp=e0000040f5f59278
 [<a0000001005c7610>] inet_accept+0xb0/0x1e0
                                sp=e0000040f5f5fd90 bsp=e0000040f5f59248
 [<a00000010052bdf0>] sys_accept+0x170/0x300
                                sp=e0000040f5f5fda0 bsp=e0000040f5f591a0
 [<a000000100011720>] ia64_ret_from_syscall+0x0/0x20
                                sp=e0000040f5f5fe30 bsp=e0000040f5f591a0
 <0>Kernel panic: Aiee, killing interrupt handler!
In interrupt handler - not syncing

>(Your machine does not look too much in shape looking at INIT messages 
>though)
>
>
>
>- Davide
>


[-- Attachment #2: config.gz --]
[-- Type: application/x-gzip, Size: 4836 bytes --]


* Re: Fw: epoll oops on 2.6.x
  2004-05-13 21:39 epoll oops on 2.6.x Louay Gammo
  2004-05-14  1:58 ` Fw: " Davide Libenzi
  2004-05-14 22:23 ` Louay Gammo
@ 2004-05-15  0:49 ` Davide Libenzi
  2004-05-17 16:06 ` Louay Gammo
  2004-05-17 16:27 ` Davide Libenzi
  4 siblings, 0 replies; 6+ messages in thread
From: Davide Libenzi @ 2004-05-15  0:49 UTC (permalink / raw)
  To: linux-ia64

On Fri, 14 May 2004, Louay Gammo wrote:

> I did #1 and #2 and ran my original experiment. I did not get a panic but
> I got 0 replies from the server. So I undid the #2 and I got a kernel 
> panic (included below).
> So I am going to try to overload the server on loopback interface (I think
> this is what you are saying) and I will let you know of the results.
> 
> I am enclosing the kernel config just in case that might clue anyone to 
> a configuration problem.

Can you try also to run the appended test program, like:

# ./pipetest -n 2000 -a 3 -w 20000




- Davide





/*
 *  PipeTest by Davide Libenzi ( Epoll performance tester )
 *  Copyright (C) 1999,..,2003  Davide Libenzi
 *
 *  This program is free software; you can redistribute it and/or modify
 *  it under the terms of the GNU General Public License as published by
 *  the Free Software Foundation; either version 2 of the License, or
 *  (at your option) any later version.
 *
 *  This program is distributed in the hope that it will be useful,
 *  but WITHOUT ANY WARRANTY; without even the implied warranty of
 *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 *  GNU General Public License for more details.
 *
 *  You should have received a copy of the GNU General Public License
 *  along with this program; if not, write to the Free Software
 *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 *
 *  Davide Libenzi <davidel@xmailserver.org>
 *
 *
 *  You need either a newer glibc or the epoll library available here :
 *
 *  http://www.xmailserver.org/linux-patches/nio-improve.html#sys_epoll
 *
 *  to build this source file. To build :
 *
 *  gcc -o pipetest pipetest.c -lepoll
 *
 */

#include <sys/types.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <sys/socket.h>
#include <sys/signal.h>
#include <sys/resource.h>
#include <fcntl.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>

#include <sys/epoll.h>


#define RUNTIMES 16


static int count, writes, fired;
static int *pipes;
static int num_pipes, num_active, num_writes;
static int epfd;
static struct epoll_event *events;



unsigned long long getustime(void) {
	struct timeval tm;

	gettimeofday(&tm, NULL);
	return (unsigned long long) tm.tv_sec * 1000000ULL + (unsigned long long) tm.tv_usec;
}


void read_cb(int fd, int idx) {
	int widx = idx + num_active + 1;
	u_char ch;

	if (read(fd, &ch, sizeof(ch)))
		count++;
	else
		fprintf(stderr, "false read event: fd=%d idx=%d\n", fd, idx);
	if (writes) {
		if (widx >= num_pipes)
			widx -= num_pipes;
		write(pipes[2 * widx + 1], "e", 1);
		writes--;
		fired++;
	}
}


int run_once(long *work, unsigned long long *tr) {
	int i, res;
	unsigned long long ts, te;

	fired = 0;
	for (i = 0; i < num_active; i++, fired++)
		write(pipes[i * 2 + 1], "e", 1);

	count = 0;
	writes = num_writes;

	ts = getustime();
	do {
		res = epoll_wait(epfd, events, num_pipes, 0);
		for (i = 0; i < res; i++)
			read_cb(pipes[2 * events[i].data.u32], events[i].data.u32);
	} while (count != fired);
	te = getustime();

	*tr = te - ts;
	*work = count;

	return (0);
}


int main (int argc, char **argv) {
	struct rlimit rl;
	int i, c;
	long work;
	unsigned long long tr;
	int *cp;
	struct epoll_event ev;
	extern char *optarg;

	num_pipes = 100;
	num_active = 1;
	num_writes = num_pipes;
	while ((c = getopt(argc, argv, "n:a:w:")) != -1) {
		switch (c) {
		case 'n':
			num_pipes = atoi(optarg);
			break;
		case 'a':
			num_active = atoi(optarg);
			break;
		case 'w':
			num_writes = atoi(optarg);
			break;
		default:
			fprintf(stderr, "Illegal argument \"%c\"\n", c);
			exit(1);
		}
	}

	rl.rlim_cur = rl.rlim_max = num_pipes * 2 + 50;
	if (setrlimit(RLIMIT_NOFILE, &rl) == -1) {
		perror("setrlimit"); 
		exit(1);
	}

	events = calloc(num_pipes, sizeof(struct epoll_event));
	pipes = calloc(num_pipes * 2, sizeof(int));
	if (events == NULL || pipes == NULL) {
		perror("malloc");
		exit(1);
	}

	if ((epfd = epoll_create(num_pipes)) == -1) {
		perror("epoll_create");
		exit(1);
	}

	for (cp = pipes, i = 0; i < num_pipes; i++, cp += 2) {
		if (pipe(cp) == -1) {
			perror("pipe");
			exit(1);
		}
		fcntl(cp[0], F_SETFL, fcntl(cp[0], F_GETFL) | O_NONBLOCK);
	}

	for (cp = pipes, i = 0; i < num_pipes; i++, cp += 2) {
		ev.events = EPOLLIN | EPOLLET;
		ev.data.u32 = i;
		if (epoll_ctl(epfd, EPOLL_CTL_ADD, cp[0], &ev) < 0) {
			perror("epoll_ctl");
			exit(1);
		}
	}

	for (i = 0; i < RUNTIMES; i++) {
		run_once(&work, &tr);
		if (!work)
			exit(1);
		fprintf(stdout, "%lf\n", (double) tr / (double) work);
	}

	exit(0);
}
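
The test above registers each pipe with EPOLLIN | EPOLLET and reads a single
byte per event, which works because each write is also a single byte. As a
general note on edge-triggered use, a reader normally drains the descriptor
until read() reports EAGAIN, since no further event arrives for data that is
already queued. A minimal sketch of that pattern follows; the helper names
set_nonblocking() and drain_fd() are hypothetical and are not part of
pipetest.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to non-blocking mode. Returns 0 or -1. */
int set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL);

	if (flags == -1)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical helper: read from a non-blocking fd until it would block
 * (EAGAIN) or hits end of file. Returns the number of bytes consumed,
 * or -1 on a real error.
 */
long drain_fd(int fd)
{
	char buf[256];
	long total = 0;

	for (;;) {
		ssize_t n = read(fd, buf, sizeof(buf));

		if (n > 0) {
			total += n;
			continue;
		}
		if (n == 0)
			break;		/* writer closed the pipe */
		if (errno == EINTR)
			continue;	/* interrupted, retry */
		if (errno == EAGAIN || errno == EWOULDBLOCK)
			break;		/* nothing left to read for now */
		return -1;
	}
	return total;
}
```

With an edge-triggered registration, calling drain_fd() from the event
handler avoids leaving unread bytes behind that would never be reported again.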



* Re: Fw: epoll oops on 2.6.x
  2004-05-13 21:39 epoll oops on 2.6.x Louay Gammo
                   ` (2 preceding siblings ...)
  2004-05-15  0:49 ` Davide Libenzi
@ 2004-05-17 16:06 ` Louay Gammo
  2004-05-17 16:27 ` Davide Libenzi
  4 siblings, 0 replies; 6+ messages in thread
From: Louay Gammo @ 2004-05-17 16:06 UTC (permalink / raw)
  To: linux-ia64

Davide Libenzi wrote:

>On Fri, 14 May 2004, Louay Gammo wrote:
>
>  
>
>>I did #1 and #2 and ran my original experiment. I did not get a panic but
>>I got 0 replies from the server. So I undid the #2 and I got a kernel 
>>panic (included below).
>>So I am going to try to overload the server on loopback interface (I think
>>this is what you are saying) and I will let you know of the results.
>>
>>I am enclosing the kernel config just in case that might clue anyone to 
>>a configuration problem.
>>    
>>
>
>Can you try also to run the appended test program, like:
>
># ./pipetest -n 2000 -a 3 -w 20000
>
>  
>
Ok, and I got this:
~/pipetest> ./a.out -n 2000 -a 3 -w 20000
3.917762
3.868220
3.860521
3.855072
3.853522
3.853422
3.854822
3.853072
3.853972
3.854972
3.856022
3.855172
3.856572
3.860621
3.861171
3.855672

Does this help?

>
>
>- Davide
>
>
>  
>
Louay Gammo



* Re: Fw: epoll oops on 2.6.x
  2004-05-13 21:39 epoll oops on 2.6.x Louay Gammo
                   ` (3 preceding siblings ...)
  2004-05-17 16:06 ` Louay Gammo
@ 2004-05-17 16:27 ` Davide Libenzi
  4 siblings, 0 replies; 6+ messages in thread
From: Davide Libenzi @ 2004-05-17 16:27 UTC (permalink / raw)
  To: linux-ia64

On Mon, 17 May 2004, Louay Gammo wrote:

> Davide Libenzi wrote:
> 
> >On Fri, 14 May 2004, Louay Gammo wrote:
> >
> >>I did #1 and #2 and ran my original experiment. I did not get a panic but
> >>I got 0 replies from the server. So I undid the #2 and I got a kernel 
> >>panic (included below).
> >>So I am going to try to overload the server on loopback interface (I think
> >>this is what you are saying) and I will let you know of the results.
> >>
> >>I am enclosing the kernel config just in case that might clue anyone to 
> >>a configuration problem.
> >>    
> >>
> >
> >Can you try also to run the appended test program, like:
> >
> ># ./pipetest -n 2000 -a 3 -w 20000
> >
> >  
> >
> Ok, and I got this:
> ~/pipetest> ./a.out -n 2000 -a 3 -w 20000
> 3.917762
> 3.868220
> 3.860521
> 3.855072
> 3.853522
> 3.853422
> 3.854822
> 3.853072
> 3.853972
> 3.854972
> 3.856022
> 3.855172
> 3.856572
> 3.860621
> 3.861171
> 3.855672
> 
> Does this help?

This proves that it works fine with pipes. Can you try to put your server 
listening on localhost and load it thru localhost?
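
For reference, the loopback setup being asked for can be sketched in a few
lines of C. This is only an illustration under assumptions, not the code of
the server in question: the helper names make_loopback_listener(),
listener_port(), watch_listener() and accept_one() are hypothetical, and the
backlog of 128 is arbitrary. It binds a non-blocking listener to 127.0.0.1,
registers it with epoll, and accepts one connection when epoll reports the
listener readable.

```c
#include <arpa/inet.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: non-blocking TCP listener bound to 127.0.0.1.
 * Port 0 lets the kernel pick a free port. Returns the fd or -1. */
int make_loopback_listener(unsigned short port)
{
	struct sockaddr_in sa;
	int on = 1;
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd == -1)
		return -1;
	setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
	memset(&sa, 0, sizeof(sa));
	sa.sin_family = AF_INET;
	sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sa.sin_port = htons(port);
	if (bind(fd, (struct sockaddr *) &sa, sizeof(sa)) == -1 ||
	    listen(fd, 128) == -1 ||
	    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) | O_NONBLOCK) == -1) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Hypothetical helper: port the listener is bound to, or -1. */
int listener_port(int fd)
{
	struct sockaddr_in sa;
	socklen_t len = sizeof(sa);

	if (getsockname(fd, (struct sockaddr *) &sa, &len) == -1)
		return -1;
	return ntohs(sa.sin_port);
}

/* Hypothetical helper: create an epoll fd watching lfd for EPOLLIN. */
int watch_listener(int lfd)
{
	struct epoll_event ev;
	int epfd = epoll_create(1);

	if (epfd == -1)
		return -1;
	memset(&ev, 0, sizeof(ev));
	ev.events = EPOLLIN;
	ev.data.fd = lfd;
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, lfd, &ev) == -1) {
		close(epfd);
		return -1;
	}
	return epfd;
}

/* Hypothetical helper: wait up to timeout_ms for a pending connection
 * on lfd and accept it. Returns the connection fd or -1. */
int accept_one(int epfd, int lfd, int timeout_ms)
{
	struct epoll_event ev;
	int n = epoll_wait(epfd, &ev, 1, timeout_ms);

	if (n != 1 || ev.data.fd != lfd)
		return -1;
	return accept(lfd, NULL, NULL);
}
```

Pointing the load generator at 127.0.0.1 and the port reported by
listener_port() keeps all traffic on the loopback interface, which takes the
network driver out of the picture.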



- Davide


