From mboxrd@z Thu Jan 1 00:00:00 1970
From: Charles Duffy
Subject: Sporadic loss of networking (kvm-70, e1000, tap)
Date: Mon, 07 Jul 2008 12:32:53 -0500
Message-ID:
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------010801000501020006000406"
To: kvm@vger.kernel.org
Return-path:
Received: from main.gmane.org ([80.91.229.2]:50307 "EHLO ciao.gmane.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753835AbYGGRdI (ORCPT ); Mon, 7 Jul 2008 13:33:08 -0400
Received: from list by ciao.gmane.org with local (Exim 4.43) id 1KFuaG-0003RS-UX for kvm@vger.kernel.org; Mon, 07 Jul 2008 17:33:04 +0000
Received: from rrcs-71-41-149-67.sw.biz.rr.com ([71.41.149.67]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Mon, 07 Jul 2008 17:33:04 +0000
Received: from Charles_Duffy by rrcs-71-41-149-67.sw.biz.rr.com with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Mon, 07 Jul 2008 17:33:04 +0000
Sender: kvm-owner@vger.kernel.org
List-ID:

This is a multi-part message in MIME format.
--------------010801000501020006000406
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Several times over the last few days, I've lost network connectivity
from one of my guests. This has happened only during interactive
sessions in which I take an action resulting in a large screen update.
I have tried flood pinging (only with the default, small packet size),
and not been able to reproduce under those circumstances.

This guest is running a current RHEL5/CentOS kernel
(2.6.18-53.1.13.el5) with clocksource=acpi_pm on the command line.

The host side of the network is a tap device, which is joined to a
bridge. Other VMs on the bridge still have working bidirectional
networking, and dmesg on the host shows the relevant port on the bridge
in forwarding state.

Rebooting the guest without shutting down the kvm instance does not
resolve the issue.
Powering down the VM and starting a new kvm instance *does* resolve the
issue.

Within the guest, tcpdump sees both incoming and outgoing traffic;
however, on the host, only traffic going *to* the guest is visible;
traffic the guest attempts to send is not visible.

When attempting to send, none of the counters (RX packets/TX
packets/etc) increase on the emulated e1000 device within the guest; RX
bytes and TX bytes are both 0, and all the error counters are likewise
zeroed.

The e1000 module can be reloaded without any visible errors.

Where should I start in attempting to debug this?

--------------010801000501020006000406
Content-Type: text/plain; name="ethtool_-d_eth0.failed.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="ethtool_-d_eth0.failed.txt"

MAC Registers
-------------
0x00000: CTRL (Device control register)      0x00140240
      Endian mode (buffers):                 little
      Link reset:                            normal
      Set link up:                           1
      Invert Loss-Of-Signal:                 no
      Receive flow control:                  disabled
      Transmit flow control:                 disabled
      VLAN mode:                             disabled
      Auto speed detect:                     disabled
      Speed select:                          1000Mb/s
      Force speed:                           no
      Force duplex:                          no
0x00008: STATUS (Device status register)     0x80080783
      Duplex:                                full
      Link up:                               link config
      TBI mode:                              disabled
      Link speed:                            1000Mb/s
      Bus type:                              PCI
      Bus speed:                             33MHz
      Bus width:                             32-bit
0x00100: RCTL (Receive control register)     0x00008002
      Receiver:                              enabled
      Store bad packets:                     disabled
      Unicast promiscuous:                   disabled
      Multicast promiscuous:                 disabled
      Long packet:                           disabled
      Descriptor minimum threshold size:     1/2
      Broadcast accept mode:                 accept
      VLAN filter:                           disabled
      Cononical form indicator:              disabled
      Discard pause frames:                  filtered
      Pass MAC control frames:               don't pass
      Receive buffer size:                   2048
0x02808: RDLEN (Receive desc length)         0x00000000
0x02810: RDH (Receive desc head)             0x00000000
0x02818: RDT (Receive desc tail)             0x000000FE
0x02820: RDTR (Receive delay timer)          0x00000000
0x00400: TCTL (Transmit ctrl register)       0x0103F0FA
      Transmitter:                           enabled
      Pad short packets:                     enabled
      Software XOFF Transmission:            disabled
      Re-transmit on late collision:         enabled
0x03808: TDLEN (Transmit desc length)        0x00000000
0x03810: TDH (Transmit desc head)            0x0000000F
0x03818: TDT (Transmit desc tail)            0x0000000F
0x03820: TIDV (Transmit delay timer)         0x00000000
PHY type: M88

--------------010801000501020006000406--
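
For anyone who wants to compare a dump like the attached one against a dump taken while networking still works, here is a minimal Python sketch that pulls the descriptor ring registers (RDLEN/RDH/RDT and TDLEN/TDH/TDT) out of `ethtool -d` text. The names `parse_registers` and `ring_summary` are hypothetical helpers and not part of ethtool or kvm, and the regular expression assumes the "address: NAME (description) value" line layout seen in the attachment. It only reports values and does not attempt a diagnosis.

```python
import re

# Matches register lines such as "0x03810: TDH (Transmit desc head)  0x0000000F".
# Assumption: every register line follows this "addr: NAME (text) value" layout.
REG_LINE = re.compile(r"0x[0-9A-Fa-f]+:\s+(\w+)\s+\([^)]*\)\s+0x([0-9A-Fa-f]+)")

# Ring register lines copied from the attached ethtool_-d_eth0.failed.txt.
SAMPLE = """\
0x02808: RDLEN (Receive desc length)         0x00000000
0x02810: RDH (Receive desc head)             0x00000000
0x02818: RDT (Receive desc tail)             0x000000FE
0x03808: TDLEN (Transmit desc length)        0x00000000
0x03810: TDH (Transmit desc head)            0x0000000F
0x03818: TDT (Transmit desc tail)            0x0000000F
"""


def parse_registers(dump):
    """Return {register name: integer value} for each register line found."""
    return {name: int(value, 16) for name, value in REG_LINE.findall(dump)}


def ring_summary(regs, prefix):
    """Summarise one descriptor ring; prefix is "R" (receive) or "T" (transmit)."""
    head = regs[prefix + "DH"]
    tail = regs[prefix + "DT"]
    length = regs[prefix + "DLEN"]
    return {
        "head": head,
        "tail": tail,
        "length": length,
        "head_equals_tail": head == tail,
        "length_is_zero": length == 0,
    }


if __name__ == "__main__":
    regs = parse_registers(SAMPLE)
    for prefix, label in (("R", "RX"), ("T", "TX")):
        print(label, ring_summary(regs, prefix))
```

Run against the lines above, this reports a transmit ring with head and tail both at 15 and a length register of 0, and a receive ring with head 0, tail 254 and a length register of 0. Feeding it the full output of `ethtool -d eth0` from a working session would give a second set of values to compare with.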