From: "Birger Tödtmann" <btoedtmann@iem.uni-due.de>
To: Keir Fraser <Keir.Fraser@cl.cam.ac.uk>
Cc: xen-devel@lists.xensource.com, xen-users@lists.xensource.com
Subject: Re: kernel oops/IRQ exception when networking between many domUs
Date: Mon, 06 Jun 2005 10:52:25 +0200
Message-ID: <1118047945.1972.9.camel@lomin>
In-Reply-To: <49e83a846cc77d6605f4adc2c0f34858@cl.cam.ac.uk>

On Monday, 06.06.2005 at 09:23 +0100, Keir Fraser wrote:
> On 5 Jun 2005, at 17:57, Birger Toedtmann wrote:
> 
> > Apparently it is happening somewhere here:
> >
> > [...]
> > 0xc028cbe5 <net_rx_action+1135>:        test   %eax,%eax
> > 0xc028cbe7 <net_rx_action+1137>:        je     0xc028ca82 <net_rx_action+780>
> > 0xc028cbed <net_rx_action+1143>:        mov    %esi,%eax
> > 0xc028cbef <net_rx_action+1145>:        shr    $0xc,%eax
> > 0xc028cbf2 <net_rx_action+1148>:        mov    %eax,(%esp)
> > 0xc028cbf5 <net_rx_action+1151>:        call   0xc028c4c4 <free_mfn>
> > 0xc028cbfa <net_rx_action+1156>:        mov    $0xffffffff,%ecx
> > ^^^^^^^^^^
> 
> Most likely the driver has tried to send a bogus page to a domU. 
> Because it's bogus the transfer fails. The driver then tries to free 
> the page back to Xen, but that also fails because the page is bogus. 
> This confuses the driver, which then BUG()s out.
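
For reference, the `shr $0xc` just before the call to free_mfn in the listing above looks like the usual address-to-frame-number shift. A minimal sketch of that arithmetic, assuming 4 KiB pages; `frame_number`, `is_plausible_frame` and `max_mfn` are names made up for illustration and are not taken from the driver:

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages, matching `shr $0xc` in the listing


def frame_number(addr: int) -> int:
    """Return the page frame number of a 32-bit address (address >> 12)."""
    return (addr & 0xFFFFFFFF) >> PAGE_SHIFT


def is_plausible_frame(mfn: int, max_mfn: int) -> bool:
    """Hypothetical sanity check: a frame outside [0, max_mfn) is bogus."""
    return 0 <= mfn < max_mfn


if __name__ == "__main__":
    addr = 0x12345678          # arbitrary example address, not from the oops
    mfn = frame_number(addr)
    print(hex(mfn), is_plausible_frame(mfn, max_mfn=0x10000))
```

If the address read from the skbuff is already garbage, the shifted value is garbage too, which would fit the failing free described above.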

I commented out the free_mfn() and status= lines. The kernel now reports
the following after configuring the 10th domU and roughly the 80th vif,
with approx. 20-25 bridges up.  Just an idea: the number of vifs + bridges
is somewhere around the magic 128 (NR_IRQS problem in 2.0.x!) when the
crash happens. Could this hint at something?
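
To make the guess concrete, here is a rough back-of-the-envelope check of the numbers above against a limit of 128. This is only a sketch: the limit of 128 and the idea that each vif and each bridge counts once are assumptions, and `headroom` is a made-up helper name.

```python
ASSUMED_LIMIT = 128  # the "magic 128" from the text; not verified for this kernel


def headroom(vifs: int, bridges: int, limit: int = ASSUMED_LIMIT) -> int:
    """Slots left under the assumed limit if every vif and bridge uses one."""
    return limit - (vifs + bridges)


if __name__ == "__main__":
    # ~80 vifs with 20 to 25 bridges, as observed at the time of the crash
    for bridges in (20, 25):
        print(f"vifs=80 bridges={bridges} headroom={headroom(80, bridges)}")
```

With these counts there would still be a couple of dozen slots free, so anything else that consumes entries from the same pool would have to account for the remainder.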


[...]
Jun  6 10:12:14 lomin kernel: 10.2.23.8: port 2(vif10.3) entering forwarding state
Jun  6 10:12:14 lomin kernel: 10.2.35.16: topology change detected, propagating
Jun  6 10:12:14 lomin kernel: 10.2.35.16: port 2(vif10.4) entering forwarding state
Jun  6 10:12:14 lomin kernel: 10.2.35.20: topology change detected, propagating
Jun  6 10:12:14 lomin kernel: 10.2.35.20: port 2(vif10.5) entering forwarding state
Jun  6 10:12:20 lomin kernel: c014cea4
Jun  6 10:12:20 lomin kernel:  [do_page_fault+643/1665] do_page_fault+0x469/0x738
Jun  6 10:12:20 lomin kernel:  [<c0115720>] do_page_fault+0x469/0x738
Jun  6 10:12:20 lomin kernel:  [fixup_4gb_segment+2/12] page_fault+0x2e/0x34
Jun  6 10:12:20 lomin kernel:  [<c0109a7e>] page_fault+0x2e/0x34
Jun  6 10:12:20 lomin kernel:  [do_page_fault+49/1665] do_page_fault+0x217/0x738
Jun  6 10:12:20 lomin kernel:  [<c01154ce>] do_page_fault+0x217/0x738
Jun  6 10:12:20 lomin kernel:  [fixup_4gb_segment+2/12] page_fault+0x2e/0x34
Jun  6 10:12:20 lomin kernel:  [<c0109a7e>] page_fault+0x2e/0x34
Jun  6 10:12:20 lomin kernel: PREEMPT
Jun  6 10:12:20 lomin kernel: Modules linked in: dm_snapshot pcmcia bridge ipt_REJECT ipt_state iptable_filter ipt_MASQUERADE iptable_nat ip_conntrack ip_tables autofs4 snd_seq snd_seq_device evdev usbhid rfcomm l2cap bluetooth dm_mod cryptoloop snd_pcm_oss snd_mixer_oss snd_intel8x0 snd_ac97_codec snd_pcm snd_timer snd soundcore snd_page_alloc tun uhci_hcd usb_storage usbcore irtty_sir sir_dev ircomm_tty ircomm irda yenta_socket rsrc_nonstatic pcmcia_core 3c59x
Jun  6 10:12:20 lomin kernel: CPU:    0
Jun  6 10:12:20 lomin kernel: EIP:    0061:[do_wp_page+622/1175]    Not tainted VLI
Jun  6 10:12:20 lomin kernel: EIP:    0061:[<c014cea4>]    Not tainted VLI
Jun  6 10:12:20 lomin kernel: EFLAGS: 00010206   (2.6.11.11-xen0)
Jun  6 10:12:20 lomin kernel: EIP is at handle_mm_fault+0x5d/0x222
Jun  6 10:12:20 lomin kernel: eax: 15555b18   ebx: d8788000   ecx: 00000b18   edx: 15555b18
Jun  6 10:12:20 lomin kernel: esi: dcfc3b4c   edi: dcaf5580   ebp: d8789ee4   esp: d8789ebc
Jun  6 10:12:20 lomin kernel: ds: 0069   es: 0069   ss: 0069
Jun  6 10:12:20 lomin kernel: Process python (pid: 4670, threadinfo=d8788000 task=de1a1520)
Jun  6 10:12:20 lomin kernel: Stack: 00000040 00000001 d40e687c d40e6874 00000006 d40e685c d8789f14 dcaf5580
Jun  6 10:12:20 lomin kernel:        dcaf55ac d40e6b1c d8789fbc c01154ce dcaf5580 d40e6b1c b4ec6ff0 00000001
Jun  6 10:12:20 lomin kernel:        00000001 de1a1520 b4ec6ff0 00000006 d8789fc4 d8789fc4 c03405b0 00000006
Jun  6 10:12:20 lomin kernel: Call Trace:
Jun  6 10:12:20 lomin kernel:  [dump_stack+16/32] show_stack+0x80/0x96
Jun  6 10:12:20 lomin kernel:  [<c0109c51>] show_stack+0x80/0x96
Jun  6 10:12:20 lomin kernel:  [show_registers+384/457] show_registers+0x15a/0x1d1
Jun  6 10:12:20 lomin kernel:  [<c0109de1>] show_registers+0x15a/0x1d1
Jun  6 10:12:20 lomin kernel:  [die+301/458] die+0x106/0x1c4
Jun  6 10:12:20 lomin kernel:  [<c010a001>] die+0x106/0x1c4
Jun  6 10:12:20 lomin kernel:  [do_page_fault+675/1665] do_page_fault+0x489/0x738
Jun  6 10:12:20 lomin kernel:  [<c0115740>] do_page_fault+0x489/0x738
Jun  6 10:12:20 lomin kernel:  [fixup_4gb_segment+2/12] page_fault+0x2e/0x34
Jun  6 10:12:20 lomin kernel:  [<c0109a7e>] page_fault+0x2e/0x34
Jun  6 10:12:20 lomin kernel:  [do_page_fault+49/1665] do_page_fault+0x217/0x738
Jun  6 10:12:20 lomin kernel:  [<c01154ce>] do_page_fault+0x217/0x738
Jun  6 10:12:20 lomin kernel:  [fixup_4gb_segment+2/12] page_fault+0x2e/0x34
Jun  6 10:12:20 lomin kernel:  [<c0109a7e>] page_fault+0x2e/0x34
Jun  6 10:12:20 lomin kernel: Code: 8b 47 1c c1 ea 16 83 43 14 01 8d 34 90 85 f6 0f 84 52 01 00 00 89 f2 8b 4d 10 89 f8 e8 4a d1 ff ff 85 c0 89 c2 0f 84 3c 01 00 00 <8b> 00 a8 81 75 3d 85 c0 0f 84 01 01 00 00 a8 40 0f 84 a4 00 00
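
The Code: dump above marks the byte at the faulting EIP with angle brackets. A small sketch for pulling that byte and its offset out of such a line; `faulting_byte` is a made-up helper name, and the input is assumed to be the hex bytes after the "Code:" prefix, already joined onto one line:

```python
import re


def faulting_byte(code_bytes: str):
    """Return (offset, value) of the <xx>-marked byte, or None if unmarked."""
    for offset, token in enumerate(code_bytes.split()):
        match = re.fullmatch(r"<([0-9a-fA-F]{2})>", token)
        if match:
            return offset, int(match.group(1), 16)
    return None


if __name__ == "__main__":
    # Tail of the dump above, starting at the `test %eax,%eax` bytes
    sample = "85 c0 89 c2 0f 84 3c 01 00 00 <8b> 00 a8 81 75 3d"
    print(faulting_byte(sample))
```

As far as I can decode it, the marked bytes `8b 00` are `mov (%eax),%eax`, which would fit a dereference of the odd-looking value 15555b18 held in eax.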


> 
> It's not at all clear where the bogus address comes from: the driver 
> basically just reads the address out of an skbuff, and converts it from 
> virtual to physical address. But something is obviously going wrong, 
> perhaps under memory pressure. :-(

Where, within the domUs or dom0?  The latter has lots of memory at hand,
while the domUs are quite strapped for memory.  I'll try to find out...


Regards,
-- 
Birger Tödtmann
Technik der Rechnernetze, Institut für Experimentelle Mathematik
Universität Duisburg-Essen, Campus Essen email:btoedtmann@iem.uni-due.de
skype:birger.toedtmann pgp:0x6FB166C9
