* Re: Driver domain - NEW issue: IRQ handling error
  2005-02-01 13:11 Driver domain - NEW issue: IRQ handling error B.G. Bruce
@ 2005-02-01 13:10 ` Mark Williamson
  2005-02-01 13:48   ` B.G. Bruce
  2005-02-01 16:13   ` B.G. Bruce
  2005-02-01 19:19 ` Anonymous
  2005-02-02  1:28 ` Christian Limpach
  2 siblings, 2 replies; 15+ messages in thread
From: Mark Williamson @ 2005-02-01 13:10 UTC (permalink / raw)
  To: xen-devel, bgb

OK, I haven't heard of this issue.  Could you post your grub.conf for dom0 and 
your domain config file for the backend?

I'm not entirely clear on your configuration - how does your networking setup 
work?  What *does* work?

Cheers,
Mark

On Tuesday 01 February 2005 13:11, B.G. Bruce wrote:
> Okay, I'm using a bk snapshot of testing as of 20:40 (-4:00) yesterday
> so I'm pretty sure this hasn't been addressed already.
>
> Symptom:
>
> While running a "ping -f <some local host outside the box>" from dom0
> where the physical nic is in a driver dom (bridged), after about 1
> minute the connection dies and won't restart.  (even with a reboot of
> the driver domain).
>
> ex.  Dom0 vif1.0=10.1.1.1/24
>      outside host=10.1.1.2/24
>
> 	e1000 driver dom = bridge containing physical e1000(eth0) and virtual
> nic (eth1)
>
> dmesg on dom0 gives:
> irq 18: nobody cared!
>  [<c012a4b7>]
>  [<c012a547>]
>  [<c0129f6c>]
>  [<c010cd1b>]
>  [<c0105c13>]
>  [<c0108aa3>]
>  [<c0106c05>]
>  [<c0106c39>]
>  [<c02e2621>]
> handlers:
> [<c020cdb6>]
> [<cc94b867>]
> Disabling IRQ #18
>
>
> and a dmesg of the driver domain shows that the nic hooked IRQ 18:
>
> Intel(R) PRO/1000 Network Driver - version 5.5.4-k2-NAPI
> Copyright (c) 1999-2004 Intel Corporation.
> PCI: Obtained IRQ 18 for device 0000:01:01.0
> PCI: Setting latency timer of device 0000:01:01.0 to 64
> e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
>
> Am I correct in that the interrupt that should have been sent to the
> driver domain was instead sent to dom0?  or what happened?  If I don't
> have the driver dom setup correctly, would someone please explain what
> I'm doing wrong?
>
> Thanks,
> B.
>
>
> -------------------------------------------------------
> This SF.Net email is sponsored by: IntelliVIEW -- Interactive Reporting
> Tool for open source databases. Create drag-&-drop reports. Save time
> by over 75%! Publish reports on the web. Export to DOC, XLS, RTF, etc.
> Download a FREE copy at http://www.intelliview.com/go/osdn_nl
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/xen-devel



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Driver domain - NEW issue: IRQ handling error
@ 2005-02-01 13:11 B.G. Bruce
  2005-02-01 13:10 ` Mark Williamson
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: B.G. Bruce @ 2005-02-01 13:11 UTC (permalink / raw)
  To: xen-devel

Okay, I'm using a bk snapshot of testing as of 20:40 (-4:00) yesterday
so I'm pretty sure this hasn't been addressed already.  

Symptom:

While running a "ping -f <some local host outside the box>" from dom0
where the physical nic is in a driver dom (bridged), after about 1
minute the connection dies and won't restart.  (even with a reboot of
the driver domain).

ex.  Dom0 vif1.0=10.1.1.1/24
     outside host=10.1.1.2/24

	e1000 driver dom = bridge containing physical e1000(eth0) and virtual
nic (eth1)

dmesg on dom0 gives:
irq 18: nobody cared!
 [<c012a4b7>]
 [<c012a547>]
 [<c0129f6c>]
 [<c010cd1b>]
 [<c0105c13>]
 [<c0108aa3>]
 [<c0106c05>]
 [<c0106c39>]
 [<c02e2621>]
handlers:
[<c020cdb6>]
[<cc94b867>]
Disabling IRQ #18


and a dmesg of the driver domain shows that the nic hooked IRQ 18:

Intel(R) PRO/1000 Network Driver - version 5.5.4-k2-NAPI
Copyright (c) 1999-2004 Intel Corporation.
PCI: Obtained IRQ 18 for device 0000:01:01.0
PCI: Setting latency timer of device 0000:01:01.0 to 64
e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection

Am I correct in that the interrupt that should have been sent to the
driver domain was instead sent to dom0?  or what happened?  If I don't
have the driver dom setup correctly, would someone please explain what
I'm doing wrong?
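
For anyone comparing logs across domains, here is a rough sketch (not part
of Xen or the kernel; the function name is made up) that pulls the IRQ
number and the registered handler addresses out of a "nobody cared!"
report like the one above.  It assumes the 2.6-style layout shown in this
message: a call trace, a "handlers:" line, the handler addresses, then
"Disabling IRQ #N".

```python
import re

def unhandled_irqs(dmesg_text):
    """Map each IRQ reported as 'nobody cared!' to its handler addresses.

    Hypothetical helper: it only understands the log layout quoted in
    this thread and ignores everything else in the dmesg output.
    """
    result = {}
    current = None        # IRQ number of the report being parsed
    in_handlers = False   # True once the "handlers:" line has been seen
    for raw in dmesg_text.splitlines():
        line = raw.strip()
        start = re.match(r"irq (\d+): nobody cared!", line)
        if start:
            current = int(start.group(1))
            result[current] = []
            in_handlers = False
        elif current is not None and line == "handlers:":
            in_handlers = True
        elif current is not None and line.startswith("Disabling IRQ"):
            current = None
            in_handlers = False
        elif in_handlers:
            addr = re.match(r"\[<([0-9a-f]+)>\]", line)
            if addr:
                result[current].append(addr.group(1))
    return result
```

Fed the dom0 log above, this should report IRQ 18 with two handler
addresses, i.e. two drivers in dom0 have registered for that line.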

Thanks,
B.



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Driver domain - NEW issue: IRQ handling error
  2005-02-01 13:10 ` Mark Williamson
@ 2005-02-01 13:48   ` B.G. Bruce
  2005-02-01 16:13   ` B.G. Bruce
  1 sibling, 0 replies; 15+ messages in thread
From: B.G. Bruce @ 2005-02-01 13:48 UTC (permalink / raw)
  To: Mark Williamson; +Cc: xen-devel

Sure, here they are:
GRUB:
default 0
timeout 5
fallback 1
 
splashimage=(hd0,0)/grub/splash.xpm.gz
  
title=Production (xen)
root (hd0,0)
kernel /xen.gz dom0_mem=65535 console=vga physdev_dom0_hide=(01:01.0)(02:00.0) com1=115200,8n1
module /vmlinuz-2.6.10-xen0 root=/dev/md1 ro console=tty0 video=intelfb:1280x1024-16@74,accel=1
#module /vmlinuz-2.6.10-xen0 root=/dev/md1 ro console=tty0
 
title=Recovery (2.6.10)
root (hd0,0)
kernel /vmlinuz root=/dev/md1 video=intelfb:1280x1024-16@74,accel=1 devfs=nomount 3
 
title=Single user mode (2.6.10)
root (hd0,0)
kernel /vmlinuz-2.6.10-gentoo-r6 root=/dev/md1 devfs=nomount 1 video=intelfb:1024x768-16@85,accel=1

e1000 config (xm create -nf e1000):
Using config file "/etc/xen.old/e1000".
(vm
    (name e1000)
    (memory 128)
    (restart onreboot)
    (image
        (linux
            (kernel /boot/vmlinuz-2.6.10-xen-be)
            (root '/dev/hda1 ro')
            (args 'panic=1')
        )
    )
    (device (vbd (uname phy:raid1/e1000.1) (dev /dev/hda1) (mode w)))
    (device (vbd (uname phy:raid1/portage) (dev /dev/hda2) (mode r)))
    (device (vbd (uname phy:raid0/swap_e1000) (dev /dev/hda3) (mode w)))
    (device (pci (bus 0x1) (dev 0x1) (func 0x0)))
    (device (vif (mac aa:00:01:fa:00:02) (bridge e1000)))
)
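
As a sanity check on the two configs above, the sketch below compares the
devices handed to a driver domain with the ones hidden from dom0.  It is
only an illustration: the helper names are made up, the
physdev_dom0_hide syntax is taken from the GRUB entry as posted, and the
'bus,dev,func' values are assumed to be hexadecimal (the 0x-prefixed sxp
output suggests so).

```python
import re

def parse_hide_list(hide):
    """Parse e.g. '(01:01.0)(02:00.0)' into a set of (bus, dev, func) tuples."""
    pattern = r"\(([0-9a-fA-F]+):([0-9a-fA-F]+)\.([0-7])\)"
    return {
        (int(bus, 16), int(dev, 16), int(func, 16))
        for bus, dev, func in re.findall(pattern, hide)
    }

def parse_pci_entry(entry):
    """Parse an xm-style 'bus,dev,func' string such as '1,1,0' (assumed hex)."""
    bus, dev, func = (int(part.strip(), 16) for part in entry.split(","))
    return (bus, dev, func)

def unhidden_devices(pci_entries, hide):
    """Return the assigned devices that are NOT hidden from dom0."""
    hidden = parse_hide_list(hide)
    slots = [parse_pci_entry(entry) for entry in pci_entries]
    return [slot for slot in slots if slot not in hidden]
```

With pci = ['1,1,0'] and the hide list from the GRUB entry, the result is
an empty list, so the e1000 at 01:01.0 is at least hidden as intended.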


Ideally, I'd like to get frontend domains hooking directly into backend
domains; however, I do not seem to be able to get the vifX.X created in
any domain other than dom0.  xen-be is a DOM0 build with the nic drivers
included.  I'm running Gentoo-dev-sources (2.6.10-r6).  Last night I
uninstalled xen, manually checked for any extraneous xen packages/code,
grabbed a fresh clone of testing, and reinstalled xen + recompiled my
kernels.

Sample of what I've tried:
Using config file "fwmgmt".
(vm
    (name fwmgmt)
    (memory 128)
    (restart onreboot)
    (image
        (linux
            (kernel /boot/vmlinuz-2.6.10-xen-fe)
            (root '/dev/hda1 ro')
            (args 'panic=1')
        )
    )
    (device (vbd (uname phy:raid1/fwmgmt) (dev /dev/hda1) (mode w)))
    (device (vbd (uname phy:raid1/portage) (dev /dev/hda2) (mode r)))
    (device (vbd (uname phy:raid0/swap_fwmgmt) (dev /dev/hda3) (mode w)))
    (device (vif (mac aa:00:01:fa:00:04) (bridge e1000) (backend e1000)))
    (device (vif (mac aa:00:02:fa:00:04) (bridge 3c59x) (backend 3c59x)))
    (device (vif (mac aa:00:03:fa:00:04) (bridge vsw0) (backend vsw0)))
    (device (vif (mac aa:00:04:fa:00:04) (bridge mgmt)))
)


I am, however, able to bridge eth0 (real) and eth1 (virtual) in the e1000
driver domain and get the vifX.X in dom0.  If I assign a local (to the
physical nic) IP to that vif, I am able to see the rest of my network.




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Driver domain - NEW issue: IRQ handling error
  2005-02-01 13:10 ` Mark Williamson
  2005-02-01 13:48   ` B.G. Bruce
@ 2005-02-01 16:13   ` B.G. Bruce
  2005-02-01 17:01     ` B.G. Bruce
  1 sibling, 1 reply; 15+ messages in thread
From: B.G. Bruce @ 2005-02-01 16:13 UTC (permalink / raw)
  To: Mark Williamson; +Cc: xen-devel

[-- Attachment #1: Type: text/plain, Size: 2427 bytes --]

Here are the dom config files:

B.



[-- Attachment #2: e1000 --]
[-- Type: text/plain, Size: 3076 bytes --]

#  -*- mode: python; -*-
#============================================================================
# Python configuration setup for 'xm create'.
# This script sets the parameters used when a domain is created using 'xm create'.
# You use a separate script for each domain you want to create, or 
# you can set the parameters for the domain on the xm command line.
#============================================================================

#----------------------------------------------------------------------------
# Kernel image file.
kernel = "/boot/vmlinuz-2.6.10-xen-be"

# Optional ramdisk.
#ramdisk = "/boot/initrd.gz"

# The domain build function. Default is 'linux'.
#builder='linux'

# Initial memory allocation (in megabytes) for the new domain.
memory = 128

# A name for your domain. All domains must have different names.
name = "e1000"

# Which CPU to start domain on? 
#cpu = -1   # leave to Xen to pick

pci = [ '1,1,0' ]

#----------------------------------------------------------------------------
# Define network interfaces.

# Number of network interfaces. Default is 1.
nics=1

# Optionally define mac and/or bridge for the network interfaces.
# Random MACs are assigned if not given.

vif = [ "mac=aa:00:01:fa:00:02, bridge=e1000" ] 

#----------------------------------------------------------------------------
# Define the disk devices you want the domain to have access to, and
# what you want them accessible as.
# Each disk entry is of the form phy:UNAME,DEV,MODE
# where UNAME is the device, DEV is the device name the domain will see,
# and MODE is r for read-only, w for read-write.

#disk = [ 'phy:raid1/nic_0_e1000,/dev/hda1,w', 
disk = [ 'phy:raid1/e1000.1,/dev/hda1,w', 
	 'phy:raid1/portage,/dev/hda2,r',
	 'phy:raid0/swap_e1000,/dev/hda3,w' ]


#----------------------------------------------------------------------------
# Set the kernel command line for the new domain.
# You only need to define the IP parameters and hostname if the domain's
# IP config doesn't, e.g. in ifcfg-eth0 or via DHCP.
# You can use 'extra' to set the runlevel and custom environment
# variables used by custom rc scripts (e.g. VMID=, usr= ).

# Set if you want dhcp to allocate the IP address.
#dhcp="dhcp"
# Set netmask.
#netmask=
# Set default gateway.
#gateway=
# Set the hostname.
#hostname= "vm%d" % vmid

# Set root device.
root = "/dev/hda1 ro"

# Root device for nfs.
#root = "/dev/nfs"
# The nfs server.
#nfs_server = '169.254.1.0'  
# Root directory on the nfs server.
#nfs_root   = '/full/path/to/root/directory'

# Extra kernel arguments (panic=1 reboots the domain 1 second after a panic).
extra = "panic=1"

#----------------------------------------------------------------------------
# Set according to whether you want the domain restarted when it exits.
# The default is 'onreboot', which restarts the domain when it shuts down
# with exit code reboot.
# Other values are 'always', and 'never'.

restart = 'onreboot'

#============================================================================

[-- Attachment #3: fwmgmt --]
[-- Type: text/plain, Size: 3265 bytes --]

#  -*- mode: python; -*-
#============================================================================
# Python configuration setup for 'xm create'.
# This script sets the parameters used when a domain is created using 'xm create'.
# You use a separate script for each domain you want to create, or 
# you can set the parameters for the domain on the xm command line.
#============================================================================

#----------------------------------------------------------------------------
# Kernel image file.
kernel = "/boot/vmlinuz-2.6.10-xen-fe"

# Optional ramdisk.
#ramdisk = "/boot/initrd.gz"

# The domain build function. Default is 'linux'.
#builder='linux'

# Initial memory allocation (in megabytes) for the new domain.
memory = 128

# A name for your domain. All domains must have different names.
name = "fwmgmt"

# Which CPU to start domain on? 
#cpu = -1   # leave to Xen to pick

#----------------------------------------------------------------------------
# Define network interfaces.

# Number of network interfaces. Default is 1.
#nics=1

# Optionally define mac and/or bridge for the network interfaces.
# Random MACs are assigned if not given.

vif = [ "mac=aa:00:01:fa:00:04, bridge=e1000, backend=e1000", "mac=aa:00:02:fa:00:04, bridge=3c59x, backend=3c59x", "mac=aa:00:03:fa:00:04, bridge=vsw0, backend=vsw0", "mac=aa:00:04:fa:00:04, bridge=mgmt" ] 
#vif = [ "mac=aa:00:01:fa:00:04, bridge=e1000, backend=1", "mac=aa:00:04:fa:00:04, bridge=mgmt" ] 

#----------------------------------------------------------------------------
# Define the disk devices you want the domain to have access to, and
# what you want them accessible as.
# Each disk entry is of the form phy:UNAME,DEV,MODE
# where UNAME is the device, DEV is the device name the domain will see,
# and MODE is r for read-only, w for read-write.

disk = [ 'phy:raid1/fwmgmt,/dev/hda1,w', 
	 'phy:raid1/portage,/dev/hda2,r',
	 'phy:raid0/swap_fwmgmt,/dev/hda3,w' ]

#----------------------------------------------------------------------------
# Set the kernel command line for the new domain.
# You only need to define the IP parameters and hostname if the domain's
# IP config doesn't, e.g. in ifcfg-eth0 or via DHCP.
# You can use 'extra' to set the runlevel and custom environment
# variables used by custom rc scripts (e.g. VMID=, usr= ).

# Set if you want dhcp to allocate the IP address.
#dhcp="dhcp"
# Set netmask.
#netmask=
# Set default gateway.
#gateway=
# Set the hostname.
#hostname= "vm%d" % vmid

# Set root device.
root = "/dev/hda1 ro"

# Root device for nfs.
#root = "/dev/nfs"
# The nfs server.
#nfs_server = '169.254.1.0'  
# Root directory on the nfs server.
#nfs_root   = '/full/path/to/root/directory'

# Extra kernel arguments (panic=1 reboots the domain 1 second after a panic).
extra = "panic=1"

#----------------------------------------------------------------------------
# Set according to whether you want the domain restarted when it exits.
# The default is 'onreboot', which restarts the domain when it shuts down
# with exit code reboot.
# Other values are 'always', and 'never'.

restart = 'onreboot'

#============================================================================
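
To see at a glance which backend each interface in a vif list is meant to
use, something like the following could be run over the list.  This is a
sketch, not xm's own parser: the function names are invented, and
treating a missing backend= as "dom0" is an assumption about the default.

```python
def parse_vif(spec):
    """Split 'key=value, key=value' into a dict, e.g. mac/bridge/backend."""
    fields = {}
    for item in spec.split(","):
        key, _, value = item.strip().partition("=")
        if key:
            fields[key] = value
    return fields

def backend_per_bridge(vifs, default="dom0"):
    """Map each bridge name to the backend domain named for it.

    'default' is what gets reported when no backend= is given; that the
    real default is dom0 is assumed here, not taken from the Xen source.
    """
    mapping = {}
    for spec in vifs:
        fields = parse_vif(spec)
        mapping[fields.get("bridge", "?")] = fields.get("backend", default)
    return mapping
```

For the fwmgmt list above this gives e1000, 3c59x and vsw0 as their own
backends, with the mgmt bridge falling back to the default.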

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Driver domain - NEW issue: IRQ handling error
  2005-02-01 16:13   ` B.G. Bruce
@ 2005-02-01 17:01     ` B.G. Bruce
  2005-02-02  1:01       ` B.G. Bruce
  0 siblings, 1 reply; 15+ messages in thread
From: B.G. Bruce @ 2005-02-01 17:01 UTC (permalink / raw)
  To: Mark Williamson; +Cc: xen-devel

Most disconcerting is that if I perform the same ping flood from the
non-xen'd local box back to either dom0 or an ip assigned to the bridge
in the driver domain, eventually (about 100,000 packets) the same result
will occur.  The nic dies.  I can still ping between Dom0 and the driver
domain, but there is no outside traffic.  When I run this against a
stock linux kernel, there is no issue (1,000,000+ packets).

B.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Driver domain - NEW issue: IRQ handling error
  2005-02-01 13:11 Driver domain - NEW issue: IRQ handling error B.G. Bruce
  2005-02-01 13:10 ` Mark Williamson
@ 2005-02-01 19:19 ` Anonymous
  2005-02-02  1:28 ` Christian Limpach
  2 siblings, 0 replies; 15+ messages in thread
From: Anonymous @ 2005-02-01 19:19 UTC (permalink / raw)
  To: xen-devel

I also get this, but it doesn't look like the network card but the USB.

Dom0 dmesg:

PCI: Obtained IRQ 23 for device 0000:00:1d.7
ehci_hcd 0000:00:1d.7: EHCI Host Controller
PCI: Setting latency timer of device 0000:00:1d.7 to 64
ehci_hcd 0000:00:1d.7: irq 23, pci mem 0xfe700800
ehci_hcd 0000:00:1d.7: new USB bus registered, assigned bus number 1
PCI: cache line size of 128 is not supported by device 0000:00:1d.7
ehci_hcd 0000:00:1d.7: USB 2.0 initialized, EHCI 1.00, driver 26 Oct 2004
hub 1-0:1.0: USB hub found

...


irq 23: nobody cared!
 [<c013590a>] __report_bad_irq+0x2a/0xa0
 [<c01350f0>] handle_IRQ_event+0x40/0x90
 [<c0135a10>] note_interrupt+0x70/0xb0
 [<c0135268>] __do_IRQ+0x128/0x140
 [<c010eed9>] do_IRQ+0x19/0x30
 [<c0105f0f>] evtchn_do_upcall+0xaf/0x110
 [<c0109927>] hypervisor_callback+0x37/0x40
 [<c02a996f>] e1000_intr+0x1f/0x90
 [<c0105e5a>] force_evtchn_callback+0xa/0x10
 [<c01350f0>] handle_IRQ_event+0x40/0x90
 [<c0135211>] __do_IRQ+0xd1/0x140
 [<c010eed9>] do_IRQ+0x19/0x30
 [<c0105f0f>] evtchn_do_upcall+0xaf/0x110
 [<c0109927>] hypervisor_callback+0x37/0x40
 [<c010738e>] xen_idle+0x8e/0x150
 [<c0463209>] preempt_schedule+0x29/0x50
 [<c0107479>] cpu_idle+0x29/0x50
 [<c05767c8>] start_kernel+0x178/0x1c0
 [<c0576350>] unknown_bootoption+0x0/0x1e0
handlers:
[<c0390de0>] (usb_hcd_irq+0x0/0x70)
Disabling IRQ #23



This is a xen-2.0.3, 2.6.10 kernel that I built myself.
Almost everything built-in.






^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Driver domain - NEW issue: IRQ handling error
  2005-02-01 17:01     ` B.G. Bruce
@ 2005-02-02  1:01       ` B.G. Bruce
  0 siblings, 0 replies; 15+ messages in thread
From: B.G. Bruce @ 2005-02-02  1:01 UTC (permalink / raw)
  To: Mark Williamson; +Cc: xen-devel

Okay, after a little RTFM, I have the frontend domain putting its
vif in the correct backend domain (not dom0) and the backend domain
properly configured to be a backend domain (backend(netif)) in the sxp
config.  Don't I feel silly.  However, running the ping test still kills
the nic.  If I run it from the frontend domain, I get the "Disabling IRQ
18" error message in that dom.  If I run it external to the physical box,
the NIC still dies; however, if I run the NICs from dom0, everything is
fine.  For testing purposes, I'm using the SAME xen0 for booting both
xen0 and the backend domain.

Regards,
B.



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Driver domain - NEW issue: IRQ handling error
  2005-02-01 13:11 Driver domain - NEW issue: IRQ handling error B.G. Bruce
  2005-02-01 13:10 ` Mark Williamson
  2005-02-01 19:19 ` Anonymous
@ 2005-02-02  1:28 ` Christian Limpach
  2005-02-02  2:09   ` B.G. Bruce
  2 siblings, 1 reply; 15+ messages in thread
From: Christian Limpach @ 2005-02-02  1:28 UTC (permalink / raw)
  To: bgb; +Cc: xen-devel

On Tue, 01 Feb 2005 09:11:34 -0400, B.G. Bruce <bgb@nt-nv.com> wrote:
> dmesg on dom0 gives:
> irq 18: nobody cared!
> Disabling IRQ #18
> 
> and a dmesg of the driver domain shows that the nic hooked IRQ 18:
> 
> Intel(R) PRO/1000 Network Driver - version 5.5.4-k2-NAPI
> Copyright (c) 1999-2004 Intel Corporation.
> PCI: Obtained IRQ 18 for device 0000:01:01.0
> PCI: Setting latency timer of device 0000:01:01.0 to 64
> e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection

Do you have any other devices which get assigned IRQ 18?  Xen prints
information about the interrupt routing when it starts; you can read
this information with xm dmesg.  Also, lspci -v should show which
interrupt each device not hidden from dom0 uses.
FWIW, I've seen "irq nobody cared" on the IRQ assigned to the USB
controller when the kernel was built without USB support.
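
To spot sharing quickly from lspci -v output, the devices can be grouped
by the IRQ shown on their Flags line.  A rough Python sketch, assuming the
usual "IRQ <n>" field in lspci -v output; the function name and the sample
text are made up for illustration and are not captured from the machine in
this thread.  Devices hidden from dom0 will not appear in dom0's listing,
so the result still needs to be compared with xm dmesg and with lspci run
inside the driver domain.

```python
import re
from collections import defaultdict

# Illustrative sample only; NOT output from the machine discussed here.
SAMPLE_LSPCI = """\
00:1f.1 IDE interface: Example PATA controller
\tFlags: bus master, medium devsel, latency 0, IRQ 18
00:1f.2 IDE interface: Example SATA controller
\tFlags: bus master, 66MHz, medium devsel, latency 0, IRQ 18
00:1d.0 USB Controller: Example USB controller
\tFlags: bus master, medium devsel, latency 0, IRQ 16
"""

# A device header starts at column 0 with a bus:device.function slot.
SLOT_RE = re.compile(r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s")
IRQ_RE = re.compile(r"\bIRQ (\d+)\b")


def shared_irqs(lspci_text):
    """Return {irq: [slots]} for IRQs used by more than one listed device."""
    by_irq = defaultdict(list)
    slot = None
    for line in lspci_text.splitlines():
        header = SLOT_RE.match(line)
        if header:
            slot = header.group(1)
            continue
        irq = IRQ_RE.search(line)
        # Only indented detail lines belonging to a known device count.
        if irq and slot is not None and line[:1].isspace():
            by_irq[int(irq.group(1))].append(slot)
    return {n: slots for n, slots in by_irq.items() if len(slots) > 1}


print(shared_irqs(SAMPLE_LSPCI))
```

On a real system the text would come from running lspci -v and passing its
output to shared_irqs(); any IRQ listed with more than one slot is shared.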

> Am I correct in that the interrupt that should have been sent to the
> driver domain was instead sent to dom0?  or what happened?  If I don't
> have the driver dom setup correctly, would someone please explain what
> I'm doing wrong?

Yes, it should have been sent to the driver domain.  If there's a 2nd
device on IRQ 18 and this device is not hidden from dom0 and the 2nd
device gets an interrupt, it will go to dom0.  This should be harmless,
but apparently it's not.

     christian



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Driver domain - NEW issue: IRQ handling error
  2005-02-02  1:28 ` Christian Limpach
@ 2005-02-02  2:09   ` B.G. Bruce
  2005-02-02  2:47     ` Christian Limpach
  0 siblings, 1 reply; 15+ messages in thread
From: B.G. Bruce @ 2005-02-02  2:09 UTC (permalink / raw)
  To: Christian.Limpach; +Cc: xen-devel

Yeah, the SATA and PATA controllers are grabbing 18 as well.  This box
doesn't have any SATA drives, so I can exclude 00:1f.2 (it won't disable in
the BIOS), but what do I do about the PATA/EIDE (00:1f.1)?  I need it!

B.



On Tue, 2005-02-01 at 21:28, Christian Limpach wrote:
> On Tue, 01 Feb 2005 09:11:34 -0400, B.G. Bruce <bgb@nt-nv.com> wrote:
> > dmesg on dom0 gives:
> > irq 18: nobody cared!
> > Disabling IRQ #18
> > 
> > and a dmesg of the driver domain shows that the nic hooked IRQ 18:
> > 
> > Intel(R) PRO/1000 Network Driver - version 5.5.4-k2-NAPI
> > Copyright (c) 1999-2004 Intel Corporation.
> > PCI: Obtained IRQ 18 for device 0000:01:01.0
> > PCI: Setting latency timer of device 0000:01:01.0 to 64
> > e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
> 
> Do you have any other devices which get assigned IRQ 18?  Xen prints
> information about the interrupt routing when it starts; you can read
> this information with xm dmesg.  Also, lspci -v should show which
> interrupt each device not hidden from dom0 uses.
> FWIW, I've seen "irq nobody cared" on the IRQ assigned to the USB
> controller when the kernel was built without USB support.
> 
> > Am I correct in that the interrupt that should have been sent to the
> > driver domain was instead sent to dom0?  or what happened?  If I don't
> > have the driver dom setup correctly, would someone please explain what
> > I'm doing wrong?
> 
> Yes, it should have been sent to the driver domain.  If there's a 2nd
> device on IRQ 18 and this device is not hidden from dom0 and the 2nd
> device gets an interrupt, it will go to dom0.  This should be harmless,
> but apparently it's not.
> 
>      christian
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Driver domain - NEW issue: IRQ handling error
  2005-02-02  2:09   ` B.G. Bruce
@ 2005-02-02  2:47     ` Christian Limpach
  2005-02-02 17:48       ` B.G. Bruce
  2005-02-02 18:00       ` B.G. Bruce
  0 siblings, 2 replies; 15+ messages in thread
From: Christian Limpach @ 2005-02-02  2:47 UTC (permalink / raw)
  To: B.G. Bruce; +Cc: xen-devel

On Tue, Feb 01, 2005 at 10:09:45PM -0400, B.G. Bruce wrote:
> Yeah, the SATA and PATA controllers are grabbing 18 as well.  This box
> doesn't have any SATA drives, so I can exclude 00:1f.2 (it won't disable in
> the BIOS), but what do I do about the PATA/EIDE (00:1f.1)?  I need it!

Have you tried with SATA excluded?  The PATA controller could be ok
because it will have a driver handling the interrupts.  Otherwise you
could try moving the card around (to a different slot) and see if it
gets a different IRQ.

    christian




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Driver domain - NEW issue: IRQ handling error
  2005-02-02  2:47     ` Christian Limpach
@ 2005-02-02 17:48       ` B.G. Bruce
  2005-02-02 18:00       ` B.G. Bruce
  1 sibling, 0 replies; 15+ messages in thread
From: B.G. Bruce @ 2005-02-02 17:48 UTC (permalink / raw)
  To: Christian Limpach; +Cc: xen-devel

Yes, I've tried with the SATA excluded - no change.

B.

On Tue, 2005-02-01 at 22:47, Christian Limpach wrote:
> On Tue, Feb 01, 2005 at 10:09:45PM -0400, B.G. Bruce wrote:
> > Yeah, the SATA and PATA controllers are grabbing 18 as well.  This box
> > doesn't have any SATA drives, so I can exclude 00:1f.2 (it won't disable in
> > the BIOS), but what do I do about the PATA/EIDE (00:1f.1)?  I need it!
> 
> Have you tried with SATA excluded?  The PATA controller could be ok
> because it will have a driver handling the interrupts.  Otherwise you
> could try moving the card around (to a different slot) and see if it
> gets a different IRQ.
> 
>     christian
> 
> 
> 
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Driver domain - NEW issue: IRQ handling error
  2005-02-02  2:47     ` Christian Limpach
  2005-02-02 17:48       ` B.G. Bruce
@ 2005-02-02 18:00       ` B.G. Bruce
  2005-02-02 20:04         ` Christian Limpach
  1 sibling, 1 reply; 15+ messages in thread
From: B.G. Bruce @ 2005-02-02 18:00 UTC (permalink / raw)
  To: Christian Limpach; +Cc: xen-devel

What I've actually found is that if I disable the disabling of the
interrupt in kernel/irq/spurious.c, everything works fine.  I still get
the error messages (I didn't comment them out) about every
50,000-100,000 packets but I don't drop a packet and everything works as
it should.  Now obviously I don't want to keep the disabling irq
disabled, but I'm at a loss for how to fix this otherwise.

B.



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Driver domain - NEW issue: IRQ handling error
  2005-02-02 18:00       ` B.G. Bruce
@ 2005-02-02 20:04         ` Christian Limpach
  2005-02-02 21:26           ` B.G. Bruce
  0 siblings, 1 reply; 15+ messages in thread
From: Christian Limpach @ 2005-02-02 20:04 UTC (permalink / raw)
  To: B.G. Bruce; +Cc: xen-devel, Keir Fraser

On Wed, Feb 02, 2005 at 02:00:18PM -0400, B.G. Bruce wrote:
> What I've actually found is that if I disable the disabling of the
> interrupt in kernel/irq/spurious.c, everything works fine.  I still get
> the error messages (I didn't comment them out) about every
> 50,000-100,000 packets but I don't drop a packet and everything works as
> it should.  Now obviously I don't want to keep the disabling irq
> disabled, but I'm at a loss for how to fix this otherwise.

I think I understand now what's happening:
- since you have devices on IRQ18 in both dom0 and another domain,
  all IRQ18 interrupts get delivered to both (for loop in
  __do_IRQ_guest in xen/arch/x86/irq.c).
- the IDE driver in dom0 will only handle IRQs for the IDE controller.
- all e1000 interrupts will be counted as spurious/unhandled in dom0.
- if there are hardly any IDE interrupts, you can hit the case where
  of 100000 interrupts, 99900 were unhandled and this will cause the
  interrupt to get disabled.

We seem to hit the case mentioned in the parenthetical remark in the
comment above __report_bad_irq.  Not disabling the interrupt in that
case is the correct thing to do, but the sharing certainly does have a
significant performance impact.
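
A toy model of the counting may make this easier to follow.  It is a
simplified Python sketch, not the kernel code: the class, method and
function names are hypothetical, and the only numbers taken from the
description above are the 100000-interrupt window and the 99900 threshold.

```python
WINDOW = 100000     # interrupts per accounting window (from the text above)
THRESHOLD = 99900   # unhandled count above which the IRQ gets disabled


class IrqLine:
    """Toy model of dom0's spurious-interrupt accounting for one IRQ."""

    def __init__(self):
        self.count = 0
        self.unhandled = 0
        self.disabled = False

    def note_interrupt(self, handled):
        if self.disabled:
            return
        if not handled:
            self.unhandled += 1
        self.count += 1
        if self.count < WINDOW:
            return
        # End of a window: decide, then start counting afresh.
        if self.unhandled > THRESHOLD:
            self.disabled = True   # the "Disabling IRQ #18" case
        self.count = 0
        self.unhandled = 0


def run(ide_every, total=200000):
    """Deliver `total` interrupts on a shared line.

    Every `ide_every`-th interrupt belongs to the IDE controller and is
    handled by dom0; the rest stand in for e1000 interrupts that no dom0
    handler claims.  Returns True if the line ends up disabled.
    """
    line = IrqLine()
    for i in range(1, total + 1):
        line.note_interrupt(handled=(i % ide_every == 0))
    return line.disabled


print(run(ide_every=10000))  # IDE nearly idle
print(run(ide_every=100))    # IDE reasonably busy
```

With a nearly idle IDE controller almost every interrupt in the window is
unclaimed in dom0 and the line is shut off; with a busier one the
unhandled count stays under the threshold and the line survives.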

   christian




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Driver domain - NEW issue: IRQ handling error
  2005-02-02 20:04         ` Christian Limpach
@ 2005-02-02 21:26           ` B.G. Bruce
  0 siblings, 0 replies; 15+ messages in thread
From: B.G. Bruce @ 2005-02-02 21:26 UTC (permalink / raw)
  To: xen-devel

Christian,

Thank you for your time and patience in helping me to understand what is
going on.  It is greatly appreciated.

B.


On Wed, 2005-02-02 at 16:04, Christian Limpach wrote:
> On Wed, Feb 02, 2005 at 02:00:18PM -0400, B.G. Bruce wrote:
> > What I've actually found is that if I disable the disabling of the
> > interrupt in kernel/irq/spurious.c, everything works fine.  I still get
> > the error messages (I didn't comment them out) about every
> > 50,000-100,000 packets but I don't drop a packet and everything works as
> > it should.  Now obviously I don't want to keep the disabling irq
> > disabled, but I'm at a loss for how to fix this otherwise.
> 
> I think I understand now what's happening:
> - since you have devices on IRQ18 in both dom0 and another domain,
>   all IRQ18 interrupts get delivered to both (for loop in
>   __do_IRQ_guest in xen/arch/x86/irq.c).
> - the IDE driver in dom0 will only handle IRQs for the IDE controller.
> - all e1000 interrupts will be counted as spurious/unhandled in dom0.
> - if there are hardly any IDE interrupts, you can hit the case where
>   of 100000 interrupts, 99900 were unhandled and this will cause the
>   interrupt to get disabled.
> 
> We seem to hit the case mentioned in the parenthetical remark in the
> comment above __report_bad_irq.  Not disabling the interrupt in that
> case is the correct thing to do, but the sharing certainly does have a
> significant performance impact.
> 
>    christian
> 
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: Driver domain - NEW issue: IRQ handling error
@ 2005-02-02 22:43 Ian Pratt
  0 siblings, 0 replies; 15+ messages in thread
From: Ian Pratt @ 2005-02-02 22:43 UTC (permalink / raw)
  To: Christian Limpach, B.G. Bruce; +Cc: xen-devel, Keir Fraser

> > What I've actually found is that if I disable the disabling of the
> > interrupt in kernel/irq/spurious.c, everything works fine.  I still get
> > the error messages (I didn't comment them out) about every
> > 50,000-100,000 packets but I don't drop a packet and everything works as
> > it should.  Now obviously I don't want to keep the disabling irq
> > disabled, but I'm at a loss for how to fix this otherwise.
> 
> I think I understand now what's happening:
> - since you have devices on IRQ18 in both dom0 and another domain,
>   all IRQ18 interrupts get delivered to both (for loop in
>   __do_IRQ_guest in xen/arch/x86/irq.c).
> - the IDE driver in dom0 will only handle IRQs for the IDE controller.
> - all e1000 interrupts will be counted as spurious/unhandled in dom0.
> - if there are hardly any IDE interrupts, you can hit the case where
>   of 100000 interrupts, 99900 were unhandled and this will cause the
>   interrupt to get disabled.
> 
> We seem to hit the case mentioned in the parenthetical remark in the
> comment above __report_bad_irq.  Not disabling the interrupt in that
> case is the correct thing to do, but the sharing certainly does have a
> significant performance impact.

Nicely figured -- I was putting this down to ioapic folklore and just
hoping it was going to go away with new ioapic code...

Shared interrupts across multiple domains are heinous.

Ian


 



^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2005-02-02 22:43 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-02-01 13:11 Driver domain - NEW issue: IRQ handling error B.G. Bruce
2005-02-01 13:10 ` Mark Williamson
2005-02-01 13:48   ` B.G. Bruce
2005-02-01 16:13   ` B.G. Bruce
2005-02-01 17:01     ` B.G. Bruce
2005-02-02  1:01       ` B.G. Bruce
2005-02-01 19:19 ` Anonymous
2005-02-02  1:28 ` Christian Limpach
2005-02-02  2:09   ` B.G. Bruce
2005-02-02  2:47     ` Christian Limpach
2005-02-02 17:48       ` B.G. Bruce
2005-02-02 18:00       ` B.G. Bruce
2005-02-02 20:04         ` Christian Limpach
2005-02-02 21:26           ` B.G. Bruce
  -- strict thread matches above, loose matches on Subject: below --
2005-02-02 22:43 Ian Pratt
