From mboxrd@z Thu Jan 1 00:00:00 1970 From: Matthew Rushton Subject: Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving Date: Tue, 25 Mar 2014 13:09:13 -0700 Message-ID: <5331E269.9090708@gmail.com> References: <1395746524-9670-1-git-send-email-msw@linux.com> <20140325121922.GB12931@deinos.phlegethon.org> <20140325132740.GB11708@u109add4315675089e695.ant.amazon.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: Received: from mail6.bemta5.messagelabs.com ([195.245.231.135]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1WSXex-0003rB-Fo for xen-devel@lists.xenproject.org; Tue, 25 Mar 2014 20:09:19 +0000 Received: by mail-pd0-f195.google.com with SMTP id fp1so333419pdb.2 for ; Tue, 25 Mar 2014 13:09:14 -0700 (PDT) In-Reply-To: <20140325132740.GB11708@u109add4315675089e695.ant.amazon.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Matt Wilson , Tim Deegan Cc: Keir Fraser , Jan Beulich , Andrew Cooper , Matt Wilson , xen-devel@lists.xenproject.org List-Id: xen-devel@lists.xenproject.org On 03/25/14 06:27, Matt Wilson wrote: > On Tue, Mar 25, 2014 at 01:19:22PM +0100, Tim Deegan wrote: >> At 13:22 +0200 on 25 Mar (1395750124), Matt Wilson wrote: >>> From: Matt Rushton >>> >>> This patch makes the Xen heap allocator use the first half of higher >>> order chunks instead of the second half when breaking them down for >>> smaller order allocations. >>> >>> Linux currently remaps the memory overlapping PCI space one page at a >>> time. Before this change this resulted in the mfns being allocated in >>> reverse order and led to discontiguous dom0 memory. This forced dom0 >>> to use bounce buffers for doing DMA and resulted in poor performance. >> This seems like something better fixed on the dom0 side, by asking >> explicitly for contiguous memory in cases where it makes a difference. >> On the Xen side, this change seems harmless, but we might like to keep >> the explicitly reversed allocation on debug builds, to flush out >> guests that rely on their memory being contiguous. > Yes, I think that retaining the reverse allocation on debug builds is > fine. I'd like Konrad's take on if it's better or possible to fix this > on the Linux side. I considered fixing it in Linux but this was a more straight forward change with no downside as far as I can tell. I see no reason in not fixing it in both places but this at least behaves more reasonably for one potential use case. I'm also interested in other opinions. >>> This change more gracefully handles the dom0 use case and returns >>> contiguous memory for subsequent allocations. >>> >>> Cc: xen-devel@lists.xenproject.org >>> Cc: Keir Fraser >>> Cc: Jan Beulich >>> Cc: Konrad Rzeszutek Wilk >>> Cc: Tim Deegan >>> Cc: Andrew Cooper >>> Signed-off-by: Matt Rushton >>> Signed-off-by: Matt Wilson >>> --- >>> xen/common/page_alloc.c | 5 +++-- >>> 1 file changed, 3 insertions(+), 2 deletions(-) >>> >>> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c >>> index 601319c..27e7f18 100644 >>> --- a/xen/common/page_alloc.c >>> +++ b/xen/common/page_alloc.c >>> @@ -677,9 +677,10 @@ static struct page_info *alloc_heap_pages( >>> /* We may have to halve the chunk a number of times. */ >>> while ( j != order ) >>> { >>> - PFN_ORDER(pg) = --j; >>> + struct page_info *pg2; >>> + pg2 = pg + (1 << --j); >>> + PFN_ORDER(pg) = j; >>> page_list_add_tail(pg, &heap(node, zone, j)); >>> - pg += 1 << j; >> AFAICT this uses the low half (pg) for the allocation _and_ puts it on >> the freelist, and just leaks the high half (pg2). Am I missing something? > Argh, oops. this is totally my fault (not Matt R.'s). I ported the > patch out of our development tree incorrectly. The code should have > read: > > while ( j != order ) > { > struct page_info *pg2; > > pg2 = pg + (1 << --j); > PFN_ORDER(pg2) = j; > page_list_add_tail(pg2, &heap(node, zone, j)); > } > > Apologies to Matt for my mangling of his patch (which also already had > the correct blank line per Andy's comment). > > --msw No worries I was about to correct you:)