From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stefan Bader
Subject: Re: Re: Still struggling with HVM: tx timeouts on emulated nics
Date: Thu, 22 Sep 2011 16:32:54 +0200
Message-ID: <4E7B4716.1050108@canonical.com>
References: <4E79E08D.1090503@canonical.com>
 <4E7A0410.7050405@canonical.com>
 <4E7B2301.1050004@canonical.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
In-Reply-To: <4E7B2301.1050004@canonical.com>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

On 22.09.2011 13:58, Stefan Bader wrote:
> On 22.09.2011 12:30, Stefano Stabellini wrote:
>> On Wed, 21 Sep 2011, Stefan Bader wrote:
>>> On 21.09.2011 15:31, Stefano Stabellini wrote:
>>>> On Wed, 21 Sep 2011, Stefan Bader wrote:
>>>>> This is on 3.0.4 based dom0 and domU with 4.1.1 hypervisor. I tried using the
>>>>> default 8139cp and ne2k_pci emulated nic. The 8139cp one at least comes up and
>>>>> gets configured via dhcp. And initial pings also get routed and done correctly.
>>>>> But slightly higher traffic (like checking for updates) hangs. And after a while
>>>>> there are messages about tx timeouts.
>>>>> The ne2k_pci type nic almost immediately has those issues and never comes up
>>>>> correctly.
>>>>>
>>>>> I am attaching the dmesg of the guest with apic=debug enabled. I am not sure how
>>>>> this should be but both nics get configured with level,low IRQs. Disk emulation
>>>>> seems to be ok but that seem to use IO-APIC-edge. And any other IRQs seem to be
>>>>> at least not level.
>>>>
>>>
>>>> Does the e1000 emulated card work correctly?
>>>
>>> Yes, that one seems to work ok.
>>>
>>>> What happens if you disable interrupt remapping (see patch below)?
>>>
>>> 8139cp seems to work correctly now (much higher irq stats as well) and e1000
>>> still works. Both then using IOAPIC-fasteoi.
>>>
>>
>> That means there must be another subtle bug in Xen in interrupt
>> remapping that only affects 8139p emulation
>>
> Right, or to be complete:
> - e1000: ok
> - 8139cp: unstable (setup is possible)
> - ne2k_pci: not working (tx problems from the beginning)
>
> The behaviour feels a bit like interrupts may get lost if occurring at a higher
> rate. Why this affects various drivers differently is a bit weird.
>>

This is mainly speculating...

Quite a while back there was this patch to events:

commit dffe2e1e1a1ddb566a76266136c312801c66dcf7
Author: Jeremy Fitzhardinge
Date:   Fri Aug 20 19:10:01 2010 -0700

    xen: handle events as edge-triggered

The commit message stated that Xen events are logically edge triggered. So PV
events were changed to be handled as edge interrupts. Would that not mean that,
since xen-pirq-apic also uses events, the same applies there and those should be
apic-edge instead of level?

>>>>> Another problem came up recently though that may just be me doing the wrong
>>>>> thing. Normally I boot with xen_emul_unplug=unnecessary as I want the emulated
>>>>> devices. xen-blkfront is a module in my case and I thought I once had been able
>>>>> to use that by removing the unplug arg and making the blkfront driver load. But
>>>>> when I recently tried the module loaded but no disks appeared... Again, not sure
>>>>> I just forgot how to do that right or that was different when using a 4.1.0
>>>>> hypervisor still...
>>>>
>>>> xen_emul_unplug=unnecessary allows the kernel to use PV interfaces on
>>>> older hypervisors that didn't support the unplug protocol and had other
>>>> ways to cope with multiple drivers accessing the same devices.
>>>> You can use xen_emul_unplug=never to prevent any unplug but you won't
>>>> get any PV interfaces.
>>>
>>> Hm, odd. Somehow I thought that I had been using pv interfaces that way when the
>>> interrupts for the emulated ide was broken.
>>> A bit suboptimal atm, because without any option and a kernel compiled with the
>>> platform pci and pv drivers (as modules) booting in HVM mode the kernel decides
>>> that having both is no use and unplugs the emulated devices. Which then leaves
>>> you with ... none.
>>
>> In theory you would have the PV frontend modules in the initrd.
>> On the other hand having both can easily cause data corruptions on your
>> drive.
>
> They _are_ in the initrd. And the boot rightfully drops to a maintenance shell
> right now (without any argument and the emulated devices unplugged). And
> "modprobe xen-blkfront" loads the module but it does _not_ detect any pv device.
>
> -Stefan
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
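
For reference, the xen_emul_unplug behaviour described in this thread can be
written down as a small lookup. This is only a sketch: the helper names
(unplug_options, describe, THREAD_NOTES) are made up for illustration, the notes
paraphrase what was said above rather than the kernel source, and any value not
mentioned in this thread is deliberately reported as "not discussed".

```python
"""Sketch: summarise what xen_emul_unplug= on a kernel command line implies,
using only the behaviour reported in this thread (not the kernel source)."""

# Hypothetical lookup table; the wording paraphrases the thread.
THREAD_NOTES = {
    "unnecessary": "PV interfaces may be used even if the hypervisor lacks "
                   "the unplug protocol; emulated devices are kept",
    "never": "prevents any unplug, but no PV interfaces are available",
}


def unplug_options(cmdline):
    """Return all comma-separated values given via xen_emul_unplug=."""
    values = []
    for arg in cmdline.split():
        key, sep, value = arg.partition("=")
        if key == "xen_emul_unplug" and sep:
            values.extend(v for v in value.split(",") if v)
    return values


def describe(cmdline):
    """Return a one-line summary for the given kernel command line."""
    options = unplug_options(cmdline)
    if not options:
        return ("xen_emul_unplug not set: emulated devices were unplugged "
                "in the setup reported in this thread")
    return "; ".join(
        "%s: %s" % (opt, THREAD_NOTES.get(opt, "not discussed in this thread"))
        for opt in options
    )


if __name__ == "__main__":
    try:
        # /proc/cmdline holds the command line of the running kernel.
        with open("/proc/cmdline") as f:
            print(describe(f.read()))
    except OSError as err:
        print("cannot read /proc/cmdline:", err)
```

Run inside the guest, this prints which of the cases above the current boot
falls into, which makes it easier to state exactly which configuration a report
refers to.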