* 1850/2850 hangs under I/O load
@ 2005-07-13 15:55 Keir Fraser
2005-07-13 16:43 ` Brian Hays
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Keir Fraser @ 2005-07-13 15:55 UTC (permalink / raw)
To: rob, barryf-lists, davidh.davidh; +Cc: xen-devel
Looking back over the emails on this topic, someone pointed out a
patch for Linux 2.6.10 that disabled software IRQ affinity for
1850/2850 systems.
You can try a similar fix on Xen (either 2.0.x or unstable) by editing
arch/x86/irq.c:pirq_guest_bind() and removing the following lines:
    if ( desc->handler->set_affinity != NULL )
        desc->handler->set_affinity(<blah>);
If this fixes the I/O hangs for you, it is a nicer fix than
ignorebiostables. I can add a boot parameter to have the same effect,
and also probably have the fix applied automatically for 1850/2850
systems in the unstable tree (just like Linux).
Let me know how it works out.
-- Keir
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: 1850/2850 hangs under I/O load
2005-07-13 15:55 1850/2850 hangs under I/O load Keir Fraser
@ 2005-07-13 16:43 ` Brian Hays
2005-07-13 17:16 ` Keir Fraser
2005-07-13 19:39 ` David H
[not found] ` <c4e0079f0507131213190d6762@mail.gmail.com>
2 siblings, 1 reply; 10+ messages in thread
From: Brian Hays @ 2005-07-13 16:43 UTC (permalink / raw)
To: Keir Fraser; +Cc: xen-devel, barryf-lists, davidh.davidh, rob

Hi all,

I'm also about to install Xen on a PowerEdge 1850. Is this the
recommended setup (patch referenced below) for that hardware? If so,
is there anything else that may take a hit as far as performance or
reliability after making the suggested change?

Thank you,
Brian

On 7/13/05, Keir Fraser <Keir.Fraser@cl.cam.ac.uk> wrote:
>
> Looking back over the emails on this topic, someone pointed out a
> patch for Linux 2.6.10 that disabled software IRQ affinity for
> 1850/2850 systems.
>
> You can try a similar fix on Xen (either 2.0.x or unstable) by editing
> arch/x86/irq.c:pirq_guest_bind() and removing the following lines:
>
>     if ( desc->handler->set_affinity != NULL )
>         desc->handler->set_affinity(<blah>);
>
> If this fixes the I/O hangs for you, it is a nicer fix than
> ignorebiostables. I can add a boot parameter to have the same effect,
> and also probably have the fix applied automatically for 1850/2850
> systems in the unstable tree (just like Linux).
>
> Let me know how it works out.
>
> -- Keir
* Re: 1850/2850 hangs under I/O load
2005-07-13 16:43 ` Brian Hays
@ 2005-07-13 17:16 ` Keir Fraser
0 siblings, 0 replies; 10+ messages in thread
From: Keir Fraser @ 2005-07-13 17:16 UTC (permalink / raw)
To: Brian Hays; +Cc: xen-devel, barryf-lists, davidh.davidh, rob

On 13 Jul 2005, at 17:43, Brian Hays wrote:

> I'm also about to install Xen on a PowerEdge 1850. Is this the
> recommended setup (patch referenced below) for that hardware? If so,
> is there anything else that may take a hit as far as performance or
> reliability after making the suggested change?

The patch I just posted is a new one I've put up for testing and
comments. The usual fixes that people currently find to work are to
specify 'nousb' on domain0's command line, or 'ignorebiostables' on
Xen's command line.

-- Keir
* Re: 1850/2850 hangs under I/O load
2005-07-13 15:55 1850/2850 hangs under I/O load Keir Fraser
2005-07-13 16:43 ` Brian Hays
@ 2005-07-13 19:39 ` David H
[not found] ` <c4e0079f0507131213190d6762@mail.gmail.com>
2 siblings, 0 replies; 10+ messages in thread
From: David H @ 2005-07-13 19:39 UTC (permalink / raw)
To: Keir Fraser; +Cc: xen-devel, barryf-lists, rob

I have some Supermicro systems with the same chipset/problem as the
1850/2850. This patch appears to fix the problem for me. I could
always hang the server by using scp to copy a large file. I have been
able to scp this file 10 times without a hang!! However, it looks
like we lose ACPI. Is that the expected outcome?

cat /proc/interrupts before and after:

           CPU0
  1:         10        Phys-irq  i8042
  9:          0        Phys-irq  acpi
 12:        101        Phys-irq  i8042
 15:    4412096        Phys-irq  ide1
 16:      94555        Phys-irq  uhci_hcd, uhci_hcd
 18:    6161124        Phys-irq  uhci_hcd
 19:          0        Phys-irq  uhci_hcd
 48:      84003        Phys-irq  3w-xxxx
 54:    3091939        Phys-irq  eth0
256:          0     Dynamic-irq  ctrl-if
257:   22704931     Dynamic-irq  timer0
258:          0     Dynamic-irq  console
259:          0     Dynamic-irq  net-be-dbg
NMI:          0
LOC:          0
ERR:          0
MIS:          0

           CPU0
  1:         10        Phys-irq  i8042
 12:        101        Phys-irq  i8042
 15:      24398        Phys-irq  ide1
 16:     288202        Phys-irq  uhci_hcd, uhci_hcd
 18:    7522500        Phys-irq  uhci_hcd
 19:          0        Phys-irq  uhci_hcd
 48:      92704        Phys-irq  3w-xxxx
 54:    7764230        Phys-irq  eth0
128:          1     Dynamic-irq  misdirect
129:          0     Dynamic-irq  ctrl-if
130:     239113     Dynamic-irq  timer
131:          0     Dynamic-irq  console
132:          0     Dynamic-irq  net-be-dbg
NMI:          0
ERR:          0

David

On 7/13/05, Keir Fraser <Keir.Fraser@cl.cam.ac.uk> wrote:
>
> Looking back over the emails on this topic, someone pointed out a
> patch for Linux 2.6.10 that disabled software IRQ affinity for
> 1850/2850 systems.
>
> You can try a similar fix on Xen (either 2.0.x or unstable) by editing
> arch/x86/irq.c:pirq_guest_bind() and removing the following lines:
>
>     if ( desc->handler->set_affinity != NULL )
>         desc->handler->set_affinity(<blah>);
>
> If this fixes the I/O hangs for you, it is a nicer fix than
> ignorebiostables. I can add a boot parameter to have the same effect,
> and also probably have the fix applied automatically for 1850/2850
> systems in the unstable tree (just like Linux).
>
> Let me know how it works out.
>
> -- Keir
[parent not found: <c4e0079f0507131213190d6762@mail.gmail.com>]
[parent not found: <6e5c7f99544bb10aa8fe6663cfd8d79e@cl.cam.ac.uk>]
* Re: 1850/2850 hangs under I/O load
[not found] ` <6e5c7f99544bb10aa8fe6663cfd8d79e@cl.cam.ac.uk>
@ 2005-07-14 1:37 ` David H
2005-07-14 10:29 ` Keir Fraser
0 siblings, 1 reply; 10+ messages in thread
From: David H @ 2005-07-14 1:37 UTC (permalink / raw)
To: Keir Fraser, xen-devel

Replying to the list; I forgot to reply-all on the last email:

You are correct of course. Trying to do too many things at once...

I set aside some time, got my Xen versions all sorted out, and did a
little testing. It looks like this fixes the problem for my server in
2.0.6 and 2.0-testing (not sure if there is any difference between the
two at this point, but I thought it couldn't hurt to test). However,
unstable from last night's tarball still hangs, although it seems to
take a little longer to do so. Let me know if there is anything else
I can test.

Thank you, and sorry for the earlier ACPI "crazy talk". :)

David

On 7/13/05, Keir Fraser <Keir.Fraser@cl.cam.ac.uk> wrote:
>
> On 13 Jul 2005, at 20:13, David H wrote:
>
> > I have some Supermicro systems with the same chipset/problem as the
> > 1850/2850. This patch appears to fix the problem for me. I could
> > always hang the server by using scp to copy a large file. I have been
> > able to scp this file 10 times without a hang!! However, it looks
> > like we lose ACPI. Is that the expected outcome?
>
> This is great to hear. Also, the patch cannot possibly have any effect
> at all on use of ACPI -- what makes you think ACPI usage has changed?
> Maybe you are using a different kernel config?
>
> -- Keir
* Re: 1850/2850 hangs under I/O load
2005-07-14 1:37 ` David H
@ 2005-07-14 10:29 ` Keir Fraser
2005-07-14 10:45 ` Keir Fraser
2005-07-14 15:05 ` David H
0 siblings, 2 replies; 10+ messages in thread
From: Keir Fraser @ 2005-07-14 10:29 UTC (permalink / raw)
To: David H; +Cc: xen-devel

On 14 Jul 2005, at 02:37, David H wrote:

> I set aside some time, got my Xen versions all sorted out, and did a
> little testing. It looks like this fixes the problem for my server in
> 2.0.6 and 2.0-testing (not sure if there is any difference between the
> two at this point, but I thought it couldn't hurt to test). However,
> unstable from last night's tarball still hangs, although it seems to
> take a little longer to do so. Let me know if there is anything else
> I can test.

Are you sure you fixed up unstable properly, and were definitely
running the fixed-up version? It would be quite surprising if that fix
worked for xen2 but not for xen3!

I just did some tests myself, and fixing that one use of set_affinity
in pirq_guest_bind ought to be sufficient to get unstable working on
your boxes.

-- Keir
* Re: Re: 1850/2850 hangs under I/O load
2005-07-14 10:29 ` Keir Fraser
@ 2005-07-14 10:45 ` Keir Fraser
2005-07-14 11:15 ` Keir Fraser
2005-07-14 15:05 ` David H
1 sibling, 1 reply; 10+ messages in thread
From: Keir Fraser @ 2005-07-14 10:45 UTC (permalink / raw)
To: Keir Fraser; +Cc: xen-devel, David H

On 14 Jul 2005, at 11:29, Keir Fraser wrote:

> Are you sure you fixed up unstable properly, and definitely were
> running the fixed up version? It would be quite surprising if that fix
> worked for xen2 but not for xen3!
>
> I just did some tests myself, and fixing that one use of set_affinity
> in pirq_guest_bind ought to be sufficient to get unstable working on
> your boxes.

I also just added a new boot parameter 'noirqbalance' to the
2.0-testing and unstable repositories. If you're using the latest
version of either of those then just add that to Xen's command line
instead of manually patching the code.

You can tell if you have a recent enough source tree:
xen/arch/x86/irq.c will have a new opt_noirqbalance variable declared
right at the top.

-- Keir
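For readers following along, the effect of a flag like this can be sketched as a guard around the affinity call discussed earlier in the thread. This is a minimal standalone mock under stated assumptions, not the code that was checked in: the struct layouts, `record_affinity`, and `bind_irq_sketch` are hypothetical stand-ins. Only the `opt_noirqbalance` name, the `set_affinity` hook, and its NULL check come from the messages above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the hypervisor's interrupt descriptor types. */
struct hw_handler_sketch {
    void (*set_affinity)(unsigned int vector, unsigned long cpumask);
};

struct irq_desc_sketch {
    struct hw_handler_sketch *handler;
};

/* Named after the variable mentioned in the thread; assumed to be set
 * when 'noirqbalance' appears on the command line. */
int opt_noirqbalance = 0;

/* Demonstration only: counts how often the affinity hook was invoked. */
int affinity_calls = 0;

void record_affinity(unsigned int vector, unsigned long cpumask)
{
    (void)vector;
    (void)cpumask;
    affinity_calls++;
}

/* Sketch of the guarded bind step: the interrupt is only retargeted
 * when the flag is clear and the handler provides a set_affinity hook.
 * Assumes cpu is smaller than the bit width of unsigned long. */
void bind_irq_sketch(struct irq_desc_sketch *desc,
                     unsigned int vector, unsigned int cpu)
{
    unsigned long cpumask = 1UL << cpu;

    assert(desc != NULL && desc->handler != NULL);

    if ( !opt_noirqbalance && desc->handler->set_affinity != NULL )
        desc->handler->set_affinity(vector, cpumask);
}
```

With the flag clear, a call to `bind_irq_sketch` invokes the hook once; with the flag set, the hook is skipped entirely, which is the same effect as commenting the two lines out by hand.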
* Re: Re: 1850/2850 hangs under I/O load
2005-07-14 10:45 ` Keir Fraser
@ 2005-07-14 11:15 ` Keir Fraser
0 siblings, 0 replies; 10+ messages in thread
From: Keir Fraser @ 2005-07-14 11:15 UTC (permalink / raw)
To: Keir Fraser; +Cc: xen-devel, David H

>> Are you sure you fixed up unstable properly, and definitely were
>> running the fixed up version? It would be quite surprising if that
>> fix worked for xen2 but not for xen3!
>>
>> I just did some tests myself, and fixing that one use of set_affinity
>> in pirq_guest_bind ought to be sufficient to get unstable working on
>> your boxes.
>
> I also just added a new boot parameter 'noirqbalance' to the
> 2.0-testing and unstable repositories. If you're using the latest
> version of either of those then just add that to Xen's command line
> instead of manually patching the code.
>
> You can tell if you have a recent enough source tree:
> xen/arch/x86/irq.c will have a new opt_noirqbalance variable declared
> right at the top.

I've also checked in automatic disabling of IRQ balancing/affinity in
the unstable tree, so with the latest repository you shouldn't even
have to add an explicit boot parameter. If you have a serial line
attached then you should see Xen print:

    XEN: Platform quirk -- Disabling IRQ balancing/affinity.

at some point during boot of domain0.

-- Keir
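As a rough illustration of the automatic behaviour described in this message, a platform quirk can set the same flag at boot and log the line quoted above. The thread does not say how affected platforms are identified, so this sketch takes that decision as an input. The function name `apply_irq_quirk_sketch`, the `platform_is_affected` parameter, and the choice to stay silent when the flag is already set are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Named after the variable mentioned in the thread. */
int opt_noirqbalance = 0;

/* Hypothetical quirk hook. Returns 1 if the quirk was applied by this
 * call, 0 if the platform is unaffected or the flag was already set
 * (for example, by an explicit boot parameter). */
int apply_irq_quirk_sketch(int platform_is_affected, FILE *console)
{
    assert(console != NULL);

    if ( !platform_is_affected || opt_noirqbalance )
        return 0;

    opt_noirqbalance = 1;
    fprintf(console,
            "XEN: Platform quirk -- Disabling IRQ balancing/affinity.\n");
    return 1;
}
```

Passing the console as a `FILE *` keeps the sketch testable; a real hypervisor would use its own console print routine instead.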
* Re: 1850/2850 hangs under I/O load
2005-07-14 10:29 ` Keir Fraser
2005-07-14 10:45 ` Keir Fraser
@ 2005-07-14 15:05 ` David H
2005-07-14 15:14 ` Keir Fraser
1 sibling, 1 reply; 10+ messages in thread
From: David H @ 2005-07-14 15:05 UTC (permalink / raw)
To: Keir Fraser; +Cc: xen-devel

Here is the relevant portion of irq.c; let me know if this is not
correct.

    /* Attempt to bind the interrupt target to the correct CPU. */
    cpu_set(v->processor, cpumask);
    /* if ( desc->handler->set_affinity != NULL )
        desc->handler->set_affinity(vector, cpumask); */

On 7/14/05, Keir Fraser <Keir.Fraser@cl.cam.ac.uk> wrote:
>
> On 14 Jul 2005, at 02:37, David H wrote:
>
> > I set aside some time, got my Xen versions all sorted out, and did a
> > little testing. It looks like this fixes the problem for my server in
> > 2.0.6 and 2.0-testing (not sure if there is any difference between the
> > two at this point, but I thought it couldn't hurt to test). However,
> > unstable from last night's tarball still hangs, although it seems to
> > take a little longer to do so. Let me know if there is anything else
> > I can test.
>
> Are you sure you fixed up unstable properly, and definitely were
> running the fixed up version? It would be quite surprising if that fix
> worked for xen2 but not for xen3!
>
> I just did some tests myself, and fixing that one use of set_affinity
> in pirq_guest_bind ought to be sufficient to get unstable working on
> your boxes.
>
> -- Keir
* Re: 1850/2850 hangs under I/O load
2005-07-14 15:05 ` David H
@ 2005-07-14 15:14 ` Keir Fraser
0 siblings, 0 replies; 10+ messages in thread
From: Keir Fraser @ 2005-07-14 15:14 UTC (permalink / raw)
To: David H; +Cc: xen-devel

On 14 Jul 2005, at 16:05, David H wrote:

> Here is the relevant portion of irq.c; let me know if this is not
> correct.
>
>     /* Attempt to bind the interrupt target to the correct CPU. */
>     cpu_set(v->processor, cpumask);
>     /* if ( desc->handler->set_affinity != NULL )
>         desc->handler->set_affinity(vector, cpumask); */

That's the correct bit. Odd that it doesn't fix the problem on
unstable. I checked in the fix anyway, and maybe we can work out what
else important has changed between 2.0 and unstable.

-- Keir
Thread overview: 10+ messages
2005-07-13 15:55 1850/2850 hangs under I/O load Keir Fraser
2005-07-13 16:43 ` Brian Hays
2005-07-13 17:16 ` Keir Fraser
2005-07-13 19:39 ` David H
[not found] ` <c4e0079f0507131213190d6762@mail.gmail.com>
[not found] ` <6e5c7f99544bb10aa8fe6663cfd8d79e@cl.cam.ac.uk>
2005-07-14 1:37 ` David H
2005-07-14 10:29 ` Keir Fraser
2005-07-14 10:45 ` Keir Fraser
2005-07-14 11:15 ` Keir Fraser
2005-07-14 15:05 ` David H
2005-07-14 15:14 ` Keir Fraser