From: Konrad Rzeszutek Wilk
Subject: Re: Dom0 physical networking/swiotlb/something issue in 3.7-rc1
Date: Fri, 12 Oct 2012 07:59:50 -0400
Message-ID: <20121012115949.GB4028@localhost.localdomain>
In-Reply-To: <1350037688.14806.93.camel@zakaz.uk.xensource.com>
References: <1350037688.14806.93.camel@zakaz.uk.xensource.com>
To: Ian Campbell
Cc: xen-devel
List-Id: xen-devel@lists.xenproject.org

On Fri, Oct 12, 2012 at 11:28:08AM +0100, Ian Campbell wrote:
> Hi Konrad,
>
> The following patch causes fairly large packet loss when transmitting
> from dom0 to the physical network, at least with my tg3 hardware, but I
> assume it can impact anything which uses this interface.

Ah, that would explain why one of my machines suddenly started developing
checksum errors (and had a tg3 card). I hadn't gotten deep into it.

> I suspect that the issue is that the compound pages allocated in this
> way are not backed by contiguous mfns and so things fall apart when the
> driver tries to do DMA.

So this should also be easily reproduced on bare metal with 'iommu=soft' then.

> However I don't understand why the swiotlb is not fixing this up
> successfully? The tg3 driver seems to use pci_map_single on this data.
> Any thoughts? Perhaps the swiotlb (either generically or in the Xen
> backend) doesn't correctly handle compound pages?

The assumption is that it is just a page. I am surprised that the other
IOMMUs aren't hitting this as well - ah, that is b/c they do handle a
virtual address of more than one PAGE_SIZE..

> Ideally we would also fix this at the point of allocation to avoid the
> bouncing -- I suppose that would involve using the DMA API in
> netdev_alloc_frag?

Using pci_alloc_coherent would do it.. but

> We have a, sort of, similar situation in the block layer which is solved
> via BIOVEC_PHYS_MERGEABLE. Sadly I don't think anything similar can
> easily be retrofitted to the net drivers without changing every single
> one.

.. I think the right way would be to fix the SWIOTLB. And since I am now
officially the maintainer of said subsystem you have come to the right
person!

What is the easiest way of reproducing this? Just doing a large amount of
netperf/netserver traffic both ways?
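To make the "it is just a page" assumption concrete, the check the Xen
swiotlb effectively has to make for every mapping is along the lines of
the sketch below. This is illustrative only - the helper name is made up,
though pfn_to_mfn() is the real p2m lookup - the point being that a
fragment carved out of a compound page can cross a page boundary where
the MFNs stop being contiguous, so it has to be bounced even though it is
virtually and pseudo-physically contiguous:

#include <linux/mm.h>
#include <linux/io.h>
#include <asm/xen/page.h>

/* Illustrative sketch, not kernel code: the helper name is made up. */
static bool buffer_is_machine_contiguous(void *vaddr, size_t len)
{
	unsigned long first_pfn = PFN_DOWN(virt_to_phys(vaddr));
	unsigned long last_pfn  = PFN_DOWN(virt_to_phys(vaddr) + len - 1);
	unsigned long first_mfn = pfn_to_mfn(first_pfn);
	unsigned long i;

	for (i = 1; i <= last_pfn - first_pfn; i++) {
		/* Guest-contiguous PFNs need not map to contiguous MFNs. */
		if (pfn_to_mfn(first_pfn + i) != first_mfn + i)
			return false;	/* must bounce through the swiotlb */
	}
	return true;
}

A fragment carved from an order-0 page never crosses a page boundary and
trivially passes that test, which is why this only started to bite once
__netdev_alloc_frag switched to bigger compound pages.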
>
> Ian.
>
> commit 69b08f62e17439ee3d436faf0b9a7ca6fffb78db
> Author: Eric Dumazet
> Date:   Wed Sep 26 06:46:57 2012 +0000
>
>     net: use bigger pages in __netdev_alloc_frag
>
>     We currently use percpu order-0 pages in __netdev_alloc_frag
>     to deliver fragments used by __netdev_alloc_skb()
>
>     Depending on NIC driver and arch being 32 or 64 bit, it allows a page to
>     be split in several fragments (between 1 and 8), assuming PAGE_SIZE=4096
>
>     Switching to bigger pages (32768 bytes for PAGE_SIZE=4096 case) allows :
>
>     - Better filling of space (the ending hole overhead is less an issue)
>
>     - Less calls to page allocator or accesses to page->_count
>
>     - Could allow struct skb_shared_info futures changes without major
>       performance impact.
>
>     This patch implements a transparent fallback to smaller
>     pages in case of memory pressure.
>
>     It also uses a standard "struct page_frag" instead of a custom one.
>
>     Signed-off-by: Eric Dumazet
>     Cc: Alexander Duyck
>     Cc: Benjamin LaHaise
>     Signed-off-by: David S. Miller
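For reference, the "transparent fallback to smaller pages" that the commit
message mentions boils down to something like the sketch below. This is
not the actual net/core code, just the shape of the idea; the __GFP_COMP
allocation of order > 0 is what produces the compound pages whose
fragments then get handed to pci_map_single() by drivers like tg3:

#include <linux/gfp.h>
#include <linux/mm.h>

/* Sketch of the fallback idea only; not the code from the commit. */
static struct page *frag_page_refill(gfp_t gfp_mask, unsigned int *order)
{
	unsigned int o;

	/* Try a 32KB compound page first, then fall back order by order. */
	for (o = get_order(32768); o > 0; o--) {
		struct page *page;

		page = alloc_pages(gfp_mask | __GFP_COMP | __GFP_NOWARN |
				   __GFP_NORETRY, o);
		if (page) {
			*order = o;
			return page;
		}
	}

	/* Under memory pressure, settle for a plain order-0 page. */
	*order = 0;
	return alloc_pages(gfp_mask, 0);
}

The fallback keeps the allocation cheap under pressure, but every
successful order > 0 allocation is exactly the case where, under Xen,
the resulting fragments may not be machine-contiguous.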