From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving
Date: Fri, 28 Mar 2014 13:02:01 -0400
Message-ID: <20140328170201.GB12659@phenom.dumpdata.com>
References: <5331E269.9090708@gmail.com>
 <20140326095533.GA7885@deinos.phlegethon.org>
 <20140326101746.GA14195@u109add4315675089e695.ant.amazon.com>
 <20140326150801.GD18387@phenom.dumpdata.com>
 <20140326151507.GF14195@u109add4315675089e695.ant.amazon.com>
 <5332F948.1020909@gmail.com>
 <20140326163609.GD21368@phenom.dumpdata.com>
 <533312C0.1050507@gmail.com>
 <20140326175606.GA24179@phenom.dumpdata.com>
 <5333518E.40203@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path:
Received: from mail6.bemta3.messagelabs.com ([195.245.230.39])
 by lists.xen.org with esmtp (Exim 4.72) (envelope-from )
 id 1WTaAa-00028D-F7 for xen-devel@lists.xenproject.org;
 Fri, 28 Mar 2014 17:02:16 +0000
Content-Disposition: inline
In-Reply-To: <5333518E.40203@gmail.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Matthew Rushton
Cc: Keir Fraser, Matt Wilson, Matt Wilson, Tim Deegan, Jan Beulich,
 Andrew Cooper, xen-devel@lists.xenproject.org
List-Id: xen-devel@lists.xenproject.org

On Wed, Mar 26, 2014 at 03:15:42PM -0700, Matthew Rushton wrote:
> On 03/26/14 10:56, Konrad Rzeszutek Wilk wrote:
> >On Wed, Mar 26, 2014 at 10:47:44AM -0700, Matthew Rushton wrote:
> >>On 03/26/14 09:36, Konrad Rzeszutek Wilk wrote:
> >>>On Wed, Mar 26, 2014 at 08:59:04AM -0700, Matthew Rushton wrote:
> >>>>On 03/26/14 08:15, Matt Wilson wrote:
> >>>>>On Wed, Mar 26, 2014 at 11:08:01AM -0400, Konrad Rzeszutek Wilk wrote:
> >>>>>>Could you elaborate a bit more on the use-case please?
> >>>>>>My understanding is that most drivers use a scatter gather list - in which
> >>>>>>case it does not matter if the underlying MFNs in the PFN space are
> >>>>>>not contiguous.
> >>>>>>
> >>>>>>But I presume the issue you are hitting is with drivers doing dma_map_page
> >>>>>>and the page is not 4KB but rather large (compound page). Is that the
> >>>>>>problem you have observed?
> >>>>>Drivers are using very large size arguments to dma_alloc_coherent()
> >>>>>for things like RX and TX descriptor rings.
> >>>Large size like larger than 512kB? That would also cause problems
> >>>on baremetal then when swiotlb is activated I believe.
> >>I was looking at network IO performance so the buffers would not
> >>have been that large. I think large in this context is relative to
> >>the 4k page size and the odds of the buffer spanning a page
> >>boundary. For context I saw ~5-10% performance increase with guest
> >>network throughput by avoiding bounce buffers and also saw dom0 tcp
> >>streaming performance go from ~6Gb/s to over 9Gb/s on my test setup
> >>with a 10Gb NIC.
> >OK, but that would not be the dma_alloc_coherent ones then? That sounds
> >more like the generic TCP mechanism allocated 64KB pages instead of 4KB
> >and used those.
> >
> >Did you try looking at this hack that Ian proposed a long time ago
> >to verify that it is said problem?
> >
> >https://lkml.org/lkml/2013/9/4/540
> >
>
> Yes I had seen that and initially had the same reaction but the
> change was relatively recent and not relevant. I *think* all the
> coherent allocations are ok since the swiotlb makes them contiguous.
> The problem comes with the use of the streaming api. As one example
> with jumbo frames enabled a driver might use larger rx buffers which
> triggers the problem.
>
> I think the right thing to do is to make the dma streaming api work
> better with larger buffers on dom0. That way it works across all

OK.

> drivers and device types regardless of how they were designed.

Can you point me to an example of the DMA streaming API? I am not sure
whether by 'streaming API' you mean scatter-gather operations done through
the DMA API.

Is there an easy way for me to reproduce this? I have to say I haven't
enabled jumbo frames on my box, since I am not even sure the switch I
have can do it. Is there an idiot's punch list of steps to reproduce this?

Thanks!
>
> >>>>>--msw
> >>>>It's the dma streaming api I've noticed the problem with, so
> >>>>dma_map_single(). Applicable swiotlb code would be
> >>>>xen_swiotlb_map_page() and range_straddles_page_boundary(). So yes
> >>>>for larger buffers it can cause bouncing.
>
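
For readers following the thread, the bouncing condition described above
(a streaming mapping whose buffer crosses a 4k page boundary onto pages
that are not contiguous in machine memory) can be modelled in a few lines.
This is only a user-space sketch of the kind of check that
range_straddles_page_boundary() is said to perform, not the kernel code.
The names needs_bounce() and pages_machine_contiguous(), the dictionaries
standing in for the PFN-to-MFN mapping, and the example buffer sizes are
all assumptions made up for illustration.

```python
PAGE_SHIFT = 12
PAGE_SIZE = 1 << PAGE_SHIFT  # 4k pages, as discussed in the thread


def pages_machine_contiguous(p2m, pfn, offset, length):
    """Return True if every page touched by the buffer maps to consecutive MFNs.

    p2m is a hypothetical dict standing in for the PFN -> MFN translation.
    """
    nr_pages = (offset + length + PAGE_SIZE - 1) >> PAGE_SHIFT
    first_mfn = p2m[pfn]
    return all(p2m[pfn + i] == first_mfn + i for i in range(1, nr_pages))


def needs_bounce(p2m, paddr, size):
    """Model of the decision: bounce only if the buffer crosses a page
    boundary AND the pages behind it are not machine-contiguous."""
    pfn = paddr >> PAGE_SHIFT
    offset = paddr & (PAGE_SIZE - 1)
    if offset + size <= PAGE_SIZE:
        return False  # fits inside one page, contiguity is irrelevant
    return not pages_machine_contiguous(p2m, pfn, offset, size)


# Two made-up mappings: one where guest pages 0..2 are backed by
# consecutive machine frames, one where page 1 lives elsewhere.
CONTIGUOUS = {0: 100, 1: 101, 2: 102}
SCATTERED = {0: 100, 1: 205, 2: 102}

if __name__ == "__main__":
    # A 9000-byte buffer (roughly a jumbo-frame rx buffer) spans three pages.
    print("contiguous, 9000 bytes:", needs_bounce(CONTIGUOUS, 0, 9000))
    print("scattered,  9000 bytes:", needs_bounce(SCATTERED, 0, 9000))
    # A 1500-byte buffer at the start of a page never crosses a boundary.
    print("scattered,  1500 bytes:", needs_bounce(SCATTERED, 0, 1500))
```

In this model a small buffer only bounces when its offset pushes it across
a page boundary, which matches the point made earlier in the thread that
"large" is relative to the 4k page size and the odds of spanning a page.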