From mboxrd@z Thu Jan 1 00:00:00 1970
From: Chuck Ebbert
Subject: Re: [PATCH 1/2] x86: don't unnecessarily call dma_alloc_from_contiguous()
Date: Sun, 28 Sep 2014 15:41:15 -0500
Message-ID: <20140928154115.6d77541d@as>
References: <1411919524-3666-1-git-send-email-akinobu.mita@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <1411919524-3666-1-git-send-email-akinobu.mita@gmail.com>
Sender: linux-kernel-owner@vger.kernel.org
To: Akinobu Mita
Cc: linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
	Peter Hurley, Marek Szyprowski, Konrad Rzeszutek Wilk,
	David Woodhouse, Don Dutile, Thomas Gleixner, Ingo Molnar,
	"H. Peter Anvin", Andi Kleen, Yinghai Lu, x86@kernel.org,
	iommu@lists.linux-foundation.org
List-Id: iommu@lists.linux-foundation.org

On Mon, 29 Sep 2014 00:52:03 +0900
Akinobu Mita wrote:

> If CONFIG_DMA_CMA is enabled, dma_generic_alloc_coherent() tries to
> allocate memory region by dma_alloc_from_contiguous() before trying to
> use alloc_pages().
>
> This wastes CMA region by small DMA-coherent buffers which can be
> allocated by alloc_pages(). And it also causes performance degradation,
> as this is trying to drive _all_ dma mapping allocations through a
> _very_ small window, reported by Peter Hurley.
>
> This fixes it by trying to allocate by alloc_pages() first in
> dma_generic_alloc_coherent() as dma_alloc_from_contiguous should be
> called only for huge allocation.
>
> Signed-off-by: Akinobu Mita
> Reported-by: Peter Hurley
> Cc: Peter Hurley
> Cc: Marek Szyprowski
> Cc: Konrad Rzeszutek Wilk
> Cc: David Woodhouse
> Cc: Don Dutile
> Cc: Thomas Gleixner
> Cc: Ingo Molnar
> Cc: "H. Peter Anvin"
> Cc: Andi Kleen
> Cc: Yinghai Lu
> Cc: x86@kernel.org
> Cc: iommu@lists.linux-foundation.org
> ---
>  arch/x86/kernel/pci-dma.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
> index a25e202..0402266 100644
> --- a/arch/x86/kernel/pci-dma.c
> +++ b/arch/x86/kernel/pci-dma.c
> @@ -99,20 +99,20 @@ void *dma_generic_alloc_coherent(struct device *dev, size_t size,
>
>  	flag &= ~__GFP_ZERO;
>  again:
> -	page = NULL;
> +	page = alloc_pages_node(dev_to_node(dev), flag | __GFP_NOWARN,
> +				get_order(size));

Shouldn't you be doing this only in the case where order is relatively
small? Like < PAGE_ALLOC_COSTLY_ORDER or so?

>  	/* CMA can be used only in the context which permits sleeping */
> -	if (flag & __GFP_WAIT) {
> +	if (!page && (flag & __GFP_WAIT)) {
>  		page = dma_alloc_from_contiguous(dev, count, get_order(size));
>  		if (page && page_to_phys(page) + size > dma_mask) {
>  			dma_release_from_contiguous(dev, page, count);
>  			page = NULL;
>  		}
>  	}
> -	/* fallback */
> -	if (!page)
> -		page = alloc_pages_node(dev_to_node(dev), flag, get_order(size));
> -	if (!page)
> +	if (!page) {
> +		warn_alloc_failed(flag, get_order(size), NULL);
>  		return NULL;
> +	}
>
>  	addr = page_to_phys(page);
>  	if (addr + size > dma_mask) {
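
To make the question above concrete, here is a rough userspace sketch of
the ordering I have in mind. It is not kernel code and not a tested
replacement hunk. struct alloc_env, enum alloc_source and pick_source()
are made-up names that only model which allocator would end up serving a
request; the value 3 for PAGE_ALLOC_COSTLY_ORDER matches
include/linux/mmzone.h, and keeping the old "CMA first, then
alloc_pages" order for large requests is my assumption about what the
large-order path should do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same value as the kernel's definition in include/linux/mmzone.h. */
#define PAGE_ALLOC_COSTLY_ORDER 3

/* Hypothetical: which allocator ends up serving the request. */
enum alloc_source { SRC_NONE, SRC_BUDDY, SRC_CMA };

/* Hypothetical stand-ins for the state the real function depends on. */
struct alloc_env {
	bool buddy_ok;   /* alloc_pages_node() would succeed */
	bool cma_ok;     /* dma_alloc_from_contiguous() would succeed */
	bool can_sleep;  /* models (flag & __GFP_WAIT) */
};

static enum alloc_source pick_source(const struct alloc_env *env,
				     unsigned int order)
{
	bool small;

	assert(env != NULL);
	small = order < PAGE_ALLOC_COSTLY_ORDER;

	/* Small requests: try the page allocator before touching CMA. */
	if (small && env->buddy_ok)
		return SRC_BUDDY;

	/* CMA can be used only in a context which permits sleeping. */
	if (env->can_sleep && env->cma_ok)
		return SRC_CMA;

	/* Large requests: page allocator is only the fallback after CMA. */
	if (!small && env->buddy_ok)
		return SRC_BUDDY;

	return SRC_NONE;
}
```

With both allocators available and sleeping allowed, an order-0 request
is served by the page allocator and an order-3 request by CMA; an atomic
request never reaches CMA regardless of order.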