All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chris Webb <chris@arachsys.com>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] Intermittent e1000 failure on qemu-kvm 1.0
Date: Tue, 3 Apr 2012 13:42:18 +0100	[thread overview]
Message-ID: <20120403124217.GN1283@arachsys.com> (raw)
In-Reply-To: <CAJSP0QV53ceKOMvUhvsTsb5wOwKQYmLRdqnze8oeE0DhtiFh0w@mail.gmail.com>

Stefan Hajnoczi <stefanha@gmail.com> writes:

> In a case like this it might be most effective to catch a VM in the
> bad state and then go in with gdb to see what is broken.  The basic
> approach would be putting breakpoints on the e1000 device model's
> transmit/receive paths to see if the guest is giving us packets and
> whether the tap device is transmitting/receiving.  If guest and host
> appear to be working then QEMU's e1000 model must be in a bad state
> and it's a question of looking at the tx/rx rings and other hardware
> emulation state to figure out what went wrong.

Hi Stefan. I tried setting a breakpoint on start_xmit, but the qemu blew up
when I hit it:

(gdb) break /home/root/packages/qemu-kvm-1.0/src-hrw66F/hw/e1000.c:start_xmit
Function "start_xmit" not defined.
Make breakpoint pending on future shared library load? (y or [n]) n
(gdb) break /home/root/packages/qemu-kvm-1.0/src-hrw66F/hw/e1000.c:528       
Breakpoint 1 at 0x46dcd6: file /home/root/packages/qemu-kvm-1.0/src-hrw66F/hw/e1000.c, line 528.
(gdb) cont
Continuing.

Program terminated with signal SIGTRAP, Trace/breakpoint trap.
The program no longer exists.

I assume this is some subtlety with breakpointing threaded code?

However, along these lines, I note that the guest appears to have received
packets, though this count is stuck at 1993 bytes. The TX count marches upwards
as I ping outbound from the guest.

If I attach a tcpdump to tap1 on the host, I see the ARP requests going out and
apparently no reply:

0024# tcpdump -i tap1
tcpdump: WARNING: tap1: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tap1, link-type EN10MB (Ethernet), capture size 65535 bytes
12:08:35.654992 ARP, Request who-has 84.45.8.129 tell 84.45.8.242, length 28
12:08:36.654976 ARP, Request who-has 84.45.8.129 tell 84.45.8.242, length 28
12:08:37.654975 ARP, Request who-has 84.45.8.129 tell 84.45.8.242, length 28
12:08:38.670933 ARP, Request who-has 84.45.8.129 tell 84.45.8.242, length 28
12:08:39.670922 ARP, Request who-has 84.45.8.129 tell 84.45.8.242, length 28
12:08:40.670908 ARP, Request who-has 84.45.8.129 tell 84.45.8.242, length 28

Looking on br0, I do seem to see the replies:

12:12:53.509471 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 84.45.8.129 tell 84.45.8.242, length 28
12:12:53.509914 ARP, Ethernet (len 6), IPv4 (len 4), Reply 84.45.8.129 is-at 00:13:c3:35:a6:42 (oui Unknown), length 46
12:12:54.509455 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 84.45.8.129 tell 84.45.8.242, length 28
12:12:54.509875 ARP, Ethernet (len 6), IPv4 (len 4), Reply 84.45.8.129 is-at 00:13:c3:35:a6:42 (oui Unknown), length 46
12:12:55.509447 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 84.45.8.129 tell 84.45.8.242, length 28
12:12:55.509878 ARP, Ethernet (len 6), IPv4 (len 4), Reply 84.45.8.129 is-at 00:13:c3:35:a6:42 (oui Unknown), length 46
12:12:56.525424 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 84.45.8.129 tell 84.45.8.242, length 28
12:12:56.525854 ARP, Ethernet (len 6), IPv4 (len 4), Reply 84.45.8.129 is-at 00:13:c3:35:a6:42 (oui Unknown), length 46
12:12:57.525408 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 84.45.8.129 tell 84.45.8.242, length 28
12:12:57.525837 ARP, Ethernet (len 6), IPv4 (len 4), Reply 84.45.8.129 is-at 00:13:c3:35:a6:42 (oui Unknown), length 46

but they never get to tap1 despite STP being disabled and no bridge port
filtering:

  # ebtables -L
  Bridge table: filter

  Bridge chain: INPUT, entries: 0, policy: ACCEPT

  Bridge chain: FORWARD, entries: 0, policy: ACCEPT

  Bridge chain: OUTPUT, entries: 0, policy: ACCEPT

  # brctl show br0
  bridge name     bridge id               STP enabled     interfaces
  br0             8000.002590224ffa       no              eth0


This looks uncannily like a kernel problem doesn't it? However, remove the
-usbdevice tablet, and it goes away, which is truly weird! I've just done a
hundred successful reboots without it once again to confirm to myself that I'm
definitely not imagining that behaviour.

> Have you tried unloading the e1000 kernel module inside the guest and
> then modprobing it again?  Does this "fix" the issue?

Hadn't thought of that, but no, it apparently has no effect. It's still broken
after I rmmod it, modprobe it again, and reconfigure the networking.

Cheers,

Chris.

  reply	other threads:[~2012-04-03 12:42 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-04-02 15:37 [Qemu-devel] Intermittent e1000 failure on qemu-kvm 1.0 Chris Webb
2012-04-03  7:13 ` Stefan Hajnoczi
2012-04-03  8:13   ` Chris Webb
2012-04-03  8:26     ` Stefan Hajnoczi
2012-04-03 12:42       ` Chris Webb [this message]
2012-04-03 13:34         ` Stefan Hajnoczi
2012-04-03 13:41           ` Chris Webb
2012-04-03 15:36             ` Stefan Hajnoczi
2012-04-03 16:37               ` Chris Webb

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120403124217.GN1283@arachsys.com \
    --to=chris@arachsys.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.