* 1850/2850 hangs under I/O load
@ 2005-07-13 15:55 Keir Fraser
  2005-07-13 16:43 ` Brian Hays
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Keir Fraser @ 2005-07-13 15:55 UTC (permalink / raw)
  To: rob, barryf-lists, davidh.davidh; +Cc: xen-devel


Looking back over the emails on this topic, someone pointed out a
patch for Linux 2.6.10 that disabled software IRQ affinity for
1850/2850 systems.

You can try a similar fix on Xen (either 2.0.x or unstable) by editing
arch/x86/irq.c:pirq_guest_bind() and removing the following lines:

    if ( desc->handler->set_affinity != NULL )
        desc->handler->set_affinity(<blah>);

If this fixes the I/O hangs for you, it is a nicer fix than
ignorebiostables. I can add a boot parameter to have the same effect,
and also probably have the fix applied automatically for 1850/2850
systems in the unstable tree (just like Linux).

Let me know how it works out.

 -- Keir


* Re: 1850/2850 hangs under I/O load
  2005-07-13 15:55 1850/2850 hangs under I/O load Keir Fraser
@ 2005-07-13 16:43 ` Brian Hays
  2005-07-13 17:16   ` Keir Fraser
  2005-07-13 19:39 ` David H
       [not found] ` <c4e0079f0507131213190d6762@mail.gmail.com>
  2 siblings, 1 reply; 10+ messages in thread
From: Brian Hays @ 2005-07-13 16:43 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel, barryf-lists, davidh.davidh, rob

Hi all,

I'm also about to install Xen on a PowerEdge 1850. Is this the
recommended setup (patch referenced below) for that hardware? If so,
is there anything else that may take a hit as far as performance or
reliability after making the change suggested?

Thank you,
Brian

On 7/13/05, Keir Fraser <Keir.Fraser@cl.cam.ac.uk> wrote:
> 
> Looking back over the emails on this topic, someone pointed out a
> patch for Linux 2.6.10 that disabled software IRQ affinity for
> 1850/2850 systems.
> 
> You can try a similar fix on Xen (either 2.0.x or unstable) by editing
> arch/x86/irq.c:pirq_guest_bind(), and remove the following lines:
> 
>     if ( desc->handler->set_affinity != NULL )
>         desc->handler->set_affinity(<blah>);
> 
> If this fixes the I/O hangs for you, it is a nicer fix than
> ignorebiostables. I can add a boot parameter to have the same effect,
> and also probably have the fix applied automatically for 1850/2850
> systems in the unstable tree (just like Linux).
> 
> Let me know how it works out.
> 
>  -- Keir

* Re: 1850/2850 hangs under I/O load
  2005-07-13 16:43 ` Brian Hays
@ 2005-07-13 17:16   ` Keir Fraser
  0 siblings, 0 replies; 10+ messages in thread
From: Keir Fraser @ 2005-07-13 17:16 UTC (permalink / raw)
  To: Brian Hays; +Cc: xen-devel, barryf-lists, davidh.davidh, rob


On 13 Jul 2005, at 17:43, Brian Hays wrote:

> I'm also about to install Xen on a PowerEdge 1850. Is this the
> recommended setup (patch referenced below) for that hardware? If so,
> is there anything else that may take a hit as far as performance or
> reliability after making the change suggested?

The patch I just posted is a new one I've put up for testing and
comments.

The usual fixes that people find to work currently are to specify 
'nousb' on domain0's command line, or 'ignorebiostables' on Xen's 
command line.

  -- Keir


* Re: 1850/2850 hangs under I/O load
  2005-07-13 15:55 1850/2850 hangs under I/O load Keir Fraser
  2005-07-13 16:43 ` Brian Hays
@ 2005-07-13 19:39 ` David H
       [not found] ` <c4e0079f0507131213190d6762@mail.gmail.com>
  2 siblings, 0 replies; 10+ messages in thread
From: David H @ 2005-07-13 19:39 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel, barryf-lists, rob

I have some Supermicro systems with the same chipset/problem as the
1850/2850.  This patch appears to fix the problem for me.  I could
always hang the server by using scp to copy a large file.  I have been
able to scp this file 10 times without a hang!!  However, it looks
like we lose ACPI.  Is that the expected outcome?

cat /proc/interrupts before and after:
          CPU0
 1:         10        Phys-irq  i8042
 9:          0        Phys-irq  acpi
 12:        101        Phys-irq  i8042
 15:    4412096        Phys-irq  ide1
 16:      94555        Phys-irq  uhci_hcd, uhci_hcd
 18:    6161124        Phys-irq  uhci_hcd
 19:          0        Phys-irq  uhci_hcd
 48:      84003        Phys-irq  3w-xxxx
 54:    3091939        Phys-irq  eth0
256:          0     Dynamic-irq  ctrl-if
257:   22704931     Dynamic-irq  timer0
258:          0     Dynamic-irq  console
259:          0     Dynamic-irq  net-be-dbg
NMI:          0
LOC:          0
ERR:          0
MIS:          0

          CPU0
 1:         10        Phys-irq  i8042
 12:        101        Phys-irq  i8042
 15:      24398        Phys-irq  ide1
 16:     288202        Phys-irq  uhci_hcd, uhci_hcd
 18:    7522500        Phys-irq  uhci_hcd
 19:          0        Phys-irq  uhci_hcd
 48:      92704        Phys-irq  3w-xxxx
 54:    7764230        Phys-irq  eth0
128:          1     Dynamic-irq  misdirect
129:          0     Dynamic-irq  ctrl-if
130:     239113     Dynamic-irq  timer
131:          0     Dynamic-irq  console
132:          0     Dynamic-irq  net-be-dbg
NMI:          0
ERR:          0

David




* Re: 1850/2850 hangs under I/O load
       [not found]   ` <6e5c7f99544bb10aa8fe6663cfd8d79e@cl.cam.ac.uk>
@ 2005-07-14  1:37     ` David H
  2005-07-14 10:29       ` Keir Fraser
  0 siblings, 1 reply; 10+ messages in thread
From: David H @ 2005-07-14  1:37 UTC (permalink / raw)
  To: Keir Fraser, xen-devel

Replying to the list; I forgot to reply-all to the last email:

You are correct of course.  Trying to do too many things at once...

I set aside some time, got my Xen versions all sorted out, and
did a little testing.  It looks like this fixes the problem for my
server in 2.0.6 and 2.0-testing (not sure if there is any
difference between the two at this point, but I thought it couldn't
hurt to test). However, unstable from last night's tarball still hangs,
although it seems to take a little longer to do so.  Let me know if
there is anything else I can test.

Thank you, and sorry for the earlier ACPI "crazy talk". :)

David

On 7/13/05, Keir Fraser <Keir.Fraser@cl.cam.ac.uk> wrote:
> 
> On 13 Jul 2005, at 20:13, David H wrote:
> 
> > I have some supermicro systems with the same chipset/problem as the
> > 1850/2850.  This patch appears to fix the problem for me.  I could
> > always hang the server by using scp to copy a large file.  I have been
> > able to scp this file 10 times without a hang!!  However, it looks
> > like we lose ACPI.  Is that the expected outcome?
> 
> This is great to hear. Also, the patch cannot possibly have any effect
> at all on use of ACPI -- what makes you think ACPI usage has changed?
> Maybe you are using a different kernel config?
> 
>   -- Keir
> 
>


* Re: 1850/2850 hangs under I/O load
  2005-07-14  1:37     ` David H
@ 2005-07-14 10:29       ` Keir Fraser
  2005-07-14 10:45         ` Keir Fraser
  2005-07-14 15:05         ` David H
  0 siblings, 2 replies; 10+ messages in thread
From: Keir Fraser @ 2005-07-14 10:29 UTC (permalink / raw)
  To: David H; +Cc: xen-devel


On 14 Jul 2005, at 02:37, David H wrote:

> I set aside some time, got my Xen versions all sorted out, and
> did a little testing.  It looks like this fixes the problem for my
> server in 2.0.6 and 2.0-testing (not sure if there is any
> difference between the two at this point, but I thought it couldn't
> hurt to test). However, unstable from last night's tarball still hangs,
> although it seems to take a little longer to do so.  Let me know if
> there is anything else I can test.

Are you sure you fixed up unstable properly, and definitely were 
running the fixed up version? It would be quite surprising if that fix 
worked for xen2 but not for xen3!

I just did some tests myself, and fixing that one use of set_affinity 
in pirq_guest_bind ought to be sufficient to get unstable working on 
your boxes.

  -- Keir


* Re: Re: 1850/2850 hangs under I/O load
  2005-07-14 10:29       ` Keir Fraser
@ 2005-07-14 10:45         ` Keir Fraser
  2005-07-14 11:15           ` Keir Fraser
  2005-07-14 15:05         ` David H
  1 sibling, 1 reply; 10+ messages in thread
From: Keir Fraser @ 2005-07-14 10:45 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel, David H


On 14 Jul 2005, at 11:29, Keir Fraser wrote:

> Are you sure you fixed up unstable properly, and definitely were 
> running the fixed up version? It would be quite surprising if that fix 
> worked for xen2 but not for xen3!
>
> I just did some tests myself, and fixing that one use of set_affinity 
> in pirq_guest_bind ought to be sufficient to get unstable working on 
> your boxes.

I also just added a new boot parameter 'noirqbalance' to the 
2.0-testing and unstable repositories. If you're using the latest 
version of either of those then just add that to Xen's command line 
instead of manually patching the code.

You can tell if you have a recent enough source tree: 
xen/arch/x86/irq.c will have a new opt_noirqbalance variable declared 
right at the top.
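
For illustration, a flag like this can gate the set_affinity call roughly as follows. This is a standalone mock, not the code that is in the tree: the struct layouts, the bind_irq_target() helper, mock_set_affinity() and the affinity_calls counter are all made up for this sketch; only the names opt_noirqbalance and set_affinity come from the discussion above.

```c
#include <assert.h>
#include <stddef.h>

/* Mock of a controller's handler table (hypothetical layout). */
struct hw_handler {
    void (*set_affinity)(unsigned int vector, unsigned long cpumask);
};

/* Mock IRQ descriptor holding only the field this sketch needs. */
struct irq_desc {
    struct hw_handler *handler;
};

/* A plain global here; in Xen it would be driven by 'noirqbalance'. */
static int opt_noirqbalance = 0;

/* Counts affinity updates so the effect of the flag can be observed. */
static unsigned int affinity_calls = 0;

static void mock_set_affinity(unsigned int vector, unsigned long cpumask)
{
    (void)vector;
    (void)cpumask;
    affinity_calls++;
}

/*
 * Hypothetical stand-in for the relevant part of pirq_guest_bind():
 * retarget the interrupt at 'cpu' unless the flag disables it.
 * Returns 1 if set_affinity was called, 0 if it was skipped.
 */
static int bind_irq_target(struct irq_desc *desc, unsigned int vector,
                           unsigned int cpu)
{
    unsigned long cpumask;

    assert(desc != NULL && desc->handler != NULL);
    assert(cpu < 8 * sizeof(unsigned long));
    cpumask = 1UL << cpu;

    if (!opt_noirqbalance && desc->handler->set_affinity != NULL) {
        desc->handler->set_affinity(vector, cpumask);
        return 1;
    }
    return 0;
}
```

With opt_noirqbalance left at 0 the mock behaves like the unpatched code; setting it to 1 has the same effect as removing the two lines by hand.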

  -- Keir


* Re: Re: 1850/2850 hangs under I/O load
  2005-07-14 10:45         ` Keir Fraser
@ 2005-07-14 11:15           ` Keir Fraser
  0 siblings, 0 replies; 10+ messages in thread
From: Keir Fraser @ 2005-07-14 11:15 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel, David H

>> Are you sure you fixed up unstable properly, and definitely were 
>> running the fixed up version? It would be quite surprising if that 
>> fix worked for xen2 but not for xen3!
>>
>> I just did some tests myself, and fixing that one use of set_affinity 
>> in pirq_guest_bind ought to be sufficient to get unstable working on 
>> your boxes.
>
> I also just added a new boot parameter 'noirqbalance' to the 
> 2.0-testing and unstable repositories. If you're using the latest 
> version of either of those then just add that to Xen's command line 
> instead of manually patching the code.
>
> You can tell if you have a recent enough source tree: 
> xen/arch/x86/irq.c will have a new opt_noirqbalance variable declared 
> right at the top.

I've also checked in automatic disabling of IRQ balancing/affinity in 
the unstable tree, so with the latest repository you shouldn't even 
have to add an explicit boot parameter.

If you have a serial line attached then you should see Xen print:
  XEN: Platform quirk -- Disabling IRQ balancing/affinity.

at some point during boot of domain0.
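
If you capture the serial output to a file, a check for that line can be as simple as the sketch below. The quirk_applied() helper and the QUIRK_MSG macro are hypothetical names for this example; the message text is the one quoted above, matched without the leading prefix.

```c
#include <assert.h>
#include <string.h>

/* Message text as quoted above, without the leading "XEN: " prefix. */
#define QUIRK_MSG "Platform quirk -- Disabling IRQ balancing/affinity."

/*
 * Hypothetical helper: return 1 if the captured console output
 * contains the platform-quirk message, otherwise 0.
 */
static int quirk_applied(const char *console_log)
{
    assert(console_log != NULL);
    return strstr(console_log, QUIRK_MSG) != NULL;
}
```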

  -- Keir


* Re: 1850/2850 hangs under I/O load
  2005-07-14 10:29       ` Keir Fraser
  2005-07-14 10:45         ` Keir Fraser
@ 2005-07-14 15:05         ` David H
  2005-07-14 15:14           ` Keir Fraser
  1 sibling, 1 reply; 10+ messages in thread
From: David H @ 2005-07-14 15:05 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel

Here is the relevant portion of irq.c; let me know if this is not correct.

        /* Attempt to bind the interrupt target to the correct CPU. */
        cpu_set(v->processor, cpumask);
        /* if ( desc->handler->set_affinity != NULL )
            desc->handler->set_affinity(vector, cpumask); */
 

* Re: 1850/2850 hangs under I/O load
  2005-07-14 15:05         ` David H
@ 2005-07-14 15:14           ` Keir Fraser
  0 siblings, 0 replies; 10+ messages in thread
From: Keir Fraser @ 2005-07-14 15:14 UTC (permalink / raw)
  To: David H; +Cc: xen-devel


On 14 Jul 2005, at 16:05, David H wrote:

> Here is the relevant portion of irq.c, let me know if this is not
> correct.
>
>         /* Attempt to bind the interrupt target to the correct CPU. */
>         cpu_set(v->processor, cpumask);
>         /* if ( desc->handler->set_affinity != NULL )
>             desc->handler->set_affinity(vector, cpumask); */

That's the correct bit. Odd that it doesn't fix the problem on unstable.
I checked in the fix anyway, and maybe we can work out what else
important has changed between 2.0 and unstable.

  -- Keir
