From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Matt Wilson <msw@linux.com>
Cc: Keir Fraser <keir@xen.org>, Matt Wilson <msw@amazon.com>,
	Matthew Rushton <mvrushton@gmail.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Tim Deegan <tim@xen.org>, Jan Beulich <jbeulich@suse.com>,
	xen-devel@lists.xenproject.org
Subject: Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving
Date: Wed, 26 Mar 2014 11:08:01 -0400
Message-ID: <20140326150801.GD18387@phenom.dumpdata.com>
In-Reply-To: <20140326101746.GA14195@u109add4315675089e695.ant.amazon.com>

On Wed, Mar 26, 2014 at 12:17:53PM +0200, Matt Wilson wrote:
> On Wed, Mar 26, 2014 at 10:55:33AM +0100, Tim Deegan wrote:
> > Hi,
> > 
> > At 13:09 -0700 on 25 Mar (1395749353), Matthew Rushton wrote:
> > > On 03/25/14 06:27, Matt Wilson wrote:
> > > > On Tue, Mar 25, 2014 at 01:19:22PM +0100, Tim Deegan wrote:
> > > >> At 13:22 +0200 on 25 Mar (1395750124), Matt Wilson wrote:
> > > >>> From: Matt Rushton <mrushton@amazon.com>
> > > >>>
> > > >>> This patch makes the Xen heap allocator use the first half of higher
> > > >>> order chunks instead of the second half when breaking them down for
> > > >>> smaller order allocations.
> > > >>>
> > > >>> Linux currently remaps the memory overlapping PCI space one page at a
> > > >>> time. Before this change this resulted in the mfns being allocated in
> > > >>> reverse order and led to discontiguous dom0 memory. This forced dom0
> > > >>> to use bounce buffers for doing DMA and resulted in poor performance.
> > > >> This seems like something better fixed on the dom0 side, by asking
> > > >> explicitly for contiguous memory in cases where it makes a difference.
> > > >> On the Xen side, this change seems harmless, but we might like to keep
> > > >> the explicitly reversed allocation on debug builds, to flush out
> > > >> guests that rely on their memory being contiguous.
> > > > Yes, I think that retaining the reverse allocation on debug builds is
> > > > fine. I'd like Konrad's take on if it's better or possible to fix this
> > > > on the Linux side.
> > > 
> > > I considered fixing it in Linux but this was a more straightforward
> > > change with no downside as far as I can tell. I see no reason not to
> > > fix it in both places but this at least behaves more reasonably for
> > > one potential use case. I'm also interested in other opinions.
> > 
> > Well, I'm happy enough with changing Xen (though it's common code so
> > you'll need Keir's ack anyway rather than mine), since as you say it
> > happens to make one use case a bit better and is otherwise harmless.
> > But that comes with a stinking great warning:
> 
> Anyone can Ack or Nack, but I wouldn't want to move forward on a
> change like this without Keir's Ack. :-)
> 
> >  - This is not 'fixing' anything in Xen because Xen is doing exactly
> >    what dom0 asks for in the current code; and conversely
> >
> >  - dom0 (and other guests) _must_not_ rely on it, whether for
> >    performance or correctness.  Xen might change its page allocator at
> >    some point in the future, for any reason, and if linux perf starts
> >    sucking when that happens, that's (still) a linux bug.
> 
> I agree with both of these. This was just the "least change" patch to
> a particular problem we observed.
> 
> Konrad, what's the possibility of fixing this in Linux Xen PV setup
> code? I think it'd be a matter of batching up pages and doing larger
> order allocations in linux/arch/x86/xen/setup.c:xen_do_chunk(),
> falling back to smaller pages if allocations fail due to
> fragmentation, etc.

Could you elaborate a bit more on the use-case please?
My understanding is that most drivers use a scatter-gather list, in which
case it does not matter if the underlying MFNs in the PFN space are
not contiguous.

But I presume the issue you are hitting is with drivers doing dma_map_page
and the page is not 4KB but rather large (compound page). Is that the
problem you have observed?
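
To make the distinction concrete, here is a minimal sketch of the two
cases. It is an illustration only, not code from Linux or Xen: the
names machine_segments and needs_bounce and the dict-based p2m table
are hypothetical. It assumes a buffer that is contiguous in PFN space
and asks how many machine-contiguous runs back it. A scatter-gather
mapping can describe each run separately, whereas a single mapping of
a multi-page buffer needs exactly one run or it has to be bounced.

```python
def machine_segments(p2m, first_pfn, n_pages):
    """Split a PFN-contiguous buffer into MFN-contiguous runs.

    p2m is a toy PFN -> MFN lookup table (hypothetical, a plain dict).
    Returns a list of (start_mfn, n_pages) tuples.
    """
    segments = []
    for pfn in range(first_pfn, first_pfn + n_pages):
        mfn = p2m[pfn]
        if segments and segments[-1][0] + segments[-1][1] == mfn:
            # This MFN directly follows the previous run: extend it.
            start, length = segments[-1]
            segments[-1] = (start, length + 1)
        else:
            # Gap in machine address space: start a new run.
            segments.append((mfn, 1))
    return segments


def needs_bounce(p2m, first_pfn, n_pages):
    """A single (non scatter-gather) mapping needs one contiguous run."""
    return len(machine_segments(p2m, first_pfn, n_pages)) > 1


ascending = {0: 100, 1: 101, 2: 102, 3: 103}
descending = {0: 103, 1: 102, 2: 101, 3: 100}
print(machine_segments(ascending, 0, 4))
print(machine_segments(descending, 0, 4))
print(needs_bounce(descending, 0, 4))
```

With the ascending table the four pages form one run; with the
reversed table every page is its own run, which is harmless for a
scatter-gather list but not for a single mapping of the whole buffer.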

Thanks.
> 
> --msw
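
For readers following the quoted patch description, the effect of
choosing the first or the second half when halving a higher order
chunk can be sketched with a toy buddy allocator. This is a simplified
model under stated assumptions, not the Xen page_alloc code: the
function names, the list-based free lists and the single starting
chunk are all hypothetical. It only shows why page-at-a-time requests
come back in descending frame order when the second half is kept, and
in ascending order when the first half is kept.

```python
def alloc_page(free_lists, max_order, use_first_half):
    """Allocate one order-0 page from a toy buddy allocator.

    free_lists[o] holds the start frame numbers of free order-o chunks.
    """
    # Find the smallest order that has a free chunk.
    for order in range(max_order + 1):
        if free_lists[order]:
            start = free_lists[order].pop()
            break
    else:
        raise MemoryError("no free chunk available")

    # Halve the chunk until it is a single page, returning the
    # unused half to the free list of the next lower order each time.
    while order > 0:
        order -= 1
        half = 1 << order
        if use_first_half:
            # Keep the low half, free the high half.
            free_lists[order].append(start + half)
        else:
            # Keep the high half, free the low half.
            free_lists[order].append(start)
            start += half
    return start


def allocate_run(n_pages, order, use_first_half):
    """Allocate n_pages single pages from one free chunk of the given order."""
    free_lists = [[] for _ in range(order + 1)]
    free_lists[order].append(0)
    return [alloc_page(free_lists, order, use_first_half)
            for _ in range(n_pages)]


print(allocate_run(8, 3, use_first_half=False))  # [7, 6, 5, 4, 3, 2, 1, 0]
print(allocate_run(8, 3, use_first_half=True))   # [0, 1, 2, 3, 4, 5, 6, 7]
```

In this model a guest that populates consecutive PFNs one page at a
time ends up with reversed frames in the first case and contiguous
frames in the second. As noted in the thread, that ordering is a side
effect of the allocator and not something a guest can rely on.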
