* [RFC PATCH] page_alloc: use first half of higher order chunks when halving @ 2014-03-25 11:22 Matt Wilson 2014-03-25 11:44 ` Andrew Cooper 2014-03-25 12:19 ` Tim Deegan 0 siblings, 2 replies; 55+ messages in thread From: Matt Wilson @ 2014-03-25 11:22 UTC (permalink / raw) To: xen-devel Cc: Keir Fraser, Matt Wilson, Andrew Cooper, Tim Deegan, Matt Rushton, Jan Beulich From: Matt Rushton <mrushton@amazon.com> This patch makes the Xen heap allocator use the first half of higher order chunks instead of the second half when breaking them down for smaller order allocations. Linux currently remaps the memory overlapping PCI space one page at a time. Before this change this resulted in the mfns being allocated in reverse order and led to discontiguous dom0 memory. This forced dom0 to use bounce buffers for doing DMA and resulted in poor performance. This change more gracefully handles the dom0 use case and returns contiguous memory for subsequent allocations. Cc: xen-devel@lists.xenproject.org Cc: Keir Fraser <keir@xen.org> Cc: Jan Beulich <jbeulich@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Tim Deegan <tim@xen.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Matt Rushton <mrushton@amazon.com> Signed-off-by: Matt Wilson <msw@amazon.com> --- xen/common/page_alloc.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c index 601319c..27e7f18 100644 --- a/xen/common/page_alloc.c +++ b/xen/common/page_alloc.c @@ -677,9 +677,10 @@ static struct page_info *alloc_heap_pages( /* We may have to halve the chunk a number of times. */ while ( j != order ) { - PFN_ORDER(pg) = --j; + struct page_info *pg2; + pg2 = pg + (1 << --j); + PFN_ORDER(pg) = j; page_list_add_tail(pg, &heap(node, zone, j)); - pg += 1 << j; } ASSERT(avail[node][zone] >= request); -- 1.7.9.5 ^ permalink raw reply related [flat|nested] 55+ messages in thread
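For illustration, a minimal trace of the halving loop above, assuming an order-2 chunk starting at MFN m and repeated order-0 requests; the arithmetic follows from the quoted code rather than from anything stated elsewhere in the thread:

    /*
     * Original loop (before the patch): the low half of the chunk goes
     * back on the free list and pg advances past it, so the caller gets
     * the high end of the chunk.
     *
     *   start:  j = 2, pg = m, requested order = 0
     *   j -> 1: free [m, m+1] at order 1, pg = m + 2
     *   j -> 0: free [m+2]    at order 0, pg = m + 3
     *   return m + 3
     *
     * Repeated order-0 allocations therefore hand out m+3, m+2, m+1, m:
     * descending MFNs, i.e. the reverse-order, discontiguous dom0 layout
     * described in the commit message. With the intended change (the
     * corrected loop posted later in the thread, which frees pg2 rather
     * than pg) the high half is returned to the free list and pg stays
     * at m, so the same sequence of requests returns m, m+1, m+2, m+3.
     */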
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-03-25 11:22 [RFC PATCH] page_alloc: use first half of higher order chunks when halving Matt Wilson @ 2014-03-25 11:44 ` Andrew Cooper 2014-03-25 13:20 ` Matt Wilson 2014-03-25 12:19 ` Tim Deegan 1 sibling, 1 reply; 55+ messages in thread From: Andrew Cooper @ 2014-03-25 11:44 UTC (permalink / raw) To: Matt Wilson Cc: Keir Fraser, Matt Wilson, Tim Deegan, Matt Rushton, Jan Beulich, xen-devel On 25/03/14 11:22, Matt Wilson wrote: > From: Matt Rushton <mrushton@amazon.com> > > This patch makes the Xen heap allocator use the first half of higher > order chunks instead of the second half when breaking them down for > smaller order allocations. > > Linux currently remaps the memory overlapping PCI space one page at a > time. Before this change this resulted in the mfns being allocated in > reverse order and led to discontiguous dom0 memory. This forced dom0 > to use bounce buffers for doing DMA and resulted in poor performance. > > This change more gracefully handles the dom0 use case and returns > contiguous memory for subsequent allocations. > > Cc: xen-devel@lists.xenproject.org > Cc: Keir Fraser <keir@xen.org> > Cc: Jan Beulich <jbeulich@suse.com> > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > Cc: Tim Deegan <tim@xen.org> > Cc: Andrew Cooper <andrew.cooper3@citrix.com> > Signed-off-by: Matt Rushton <mrushton@amazon.com> > Signed-off-by: Matt Wilson <msw@amazon.com> How does dom0 work out that it is safe to join multiple pfns into a dma buffer without the swiotlb? > --- > xen/common/page_alloc.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c > index 601319c..27e7f18 100644 > --- a/xen/common/page_alloc.c > +++ b/xen/common/page_alloc.c > @@ -677,9 +677,10 @@ static struct page_info *alloc_heap_pages( > /* We may have to halve the chunk a number of times. */ > while ( j != order ) > { > - PFN_ORDER(pg) = --j; > + struct page_info *pg2; At the very least, Xen style mandates a blank line after this variable declaration. ~Andrew > + pg2 = pg + (1 << --j); > + PFN_ORDER(pg) = j; > page_list_add_tail(pg, &heap(node, zone, j)); > - pg += 1 << j; > } > > ASSERT(avail[node][zone] >= request); ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-03-25 11:44 ` Andrew Cooper @ 2014-03-25 13:20 ` Matt Wilson 2014-03-25 20:18 ` Matthew Rushton 0 siblings, 1 reply; 55+ messages in thread From: Matt Wilson @ 2014-03-25 13:20 UTC (permalink / raw) To: Andrew Cooper Cc: Keir Fraser, Matt Wilson, Tim Deegan, Matt Rushton, Jan Beulich, xen-devel On Tue, Mar 25, 2014 at 11:44:19AM +0000, Andrew Cooper wrote: > On 25/03/14 11:22, Matt Wilson wrote: > > From: Matt Rushton <mrushton@amazon.com> > > > > This patch makes the Xen heap allocator use the first half of higher > > order chunks instead of the second half when breaking them down for > > smaller order allocations. > > > > Linux currently remaps the memory overlapping PCI space one page at a > > time. Before this change this resulted in the mfns being allocated in > > reverse order and led to discontiguous dom0 memory. This forced dom0 > > to use bounce buffers for doing DMA and resulted in poor performance. > > > > This change more gracefully handles the dom0 use case and returns > > contiguous memory for subsequent allocations. > > > > Cc: xen-devel@lists.xenproject.org > > Cc: Keir Fraser <keir@xen.org> > > Cc: Jan Beulich <jbeulich@suse.com> > > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > Cc: Tim Deegan <tim@xen.org> > > Cc: Andrew Cooper <andrew.cooper3@citrix.com> > > Signed-off-by: Matt Rushton <mrushton@amazon.com> > > Signed-off-by: Matt Wilson <msw@amazon.com> > > How does dom0 work out that it is safe to join multiple pfns into a dma > buffer without the swiotlb? I'm not familiar enough with how this works to say. Perhaps Matt R. can chime in during his day. My guess is that xen_swiotlb_alloc_coherent() avoids allocating a contiguous region if the pages allocated already happen to be physically contiguous. Konrad, can you enlighten us? The setup code in question that does the remapping one page at a time is in arch/x86/xen/setup.c. > > --- > > xen/common/page_alloc.c | 5 +++-- > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c > > index 601319c..27e7f18 100644 > > --- a/xen/common/page_alloc.c > > +++ b/xen/common/page_alloc.c > > @@ -677,9 +677,10 @@ static struct page_info *alloc_heap_pages( > > /* We may have to halve the chunk a number of times. */ > > while ( j != order ) > > { > > - PFN_ORDER(pg) = --j; > > + struct page_info *pg2; > > At the very least, Xen style mandates a blank line after this variable > declaration. Ack. > ~Andrew > > > + pg2 = pg + (1 << --j); > > + PFN_ORDER(pg) = j; > > page_list_add_tail(pg, &heap(node, zone, j)); > > - pg += 1 << j; > > } > > > > ASSERT(avail[node][zone] >= request); > ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-03-25 13:20 ` Matt Wilson @ 2014-03-25 20:18 ` Matthew Rushton 0 siblings, 0 replies; 55+ messages in thread From: Matthew Rushton @ 2014-03-25 20:18 UTC (permalink / raw) To: Matt Wilson, Andrew Cooper Cc: Keir Fraser, Jan Beulich, Tim Deegan, Matt Wilson, xen-devel On 03/25/14 06:20, Matt Wilson wrote: > On Tue, Mar 25, 2014 at 11:44:19AM +0000, Andrew Cooper wrote: >> On 25/03/14 11:22, Matt Wilson wrote: >>> From: Matt Rushton <mrushton@amazon.com> >>> >>> This patch makes the Xen heap allocator use the first half of higher >>> order chunks instead of the second half when breaking them down for >>> smaller order allocations. >>> >>> Linux currently remaps the memory overlapping PCI space one page at a >>> time. Before this change this resulted in the mfns being allocated in >>> reverse order and led to discontiguous dom0 memory. This forced dom0 >>> to use bounce buffers for doing DMA and resulted in poor performance. >>> >>> This change more gracefully handles the dom0 use case and returns >>> contiguous memory for subsequent allocations. >>> >>> Cc: xen-devel@lists.xenproject.org >>> Cc: Keir Fraser <keir@xen.org> >>> Cc: Jan Beulich <jbeulich@suse.com> >>> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> >>> Cc: Tim Deegan <tim@xen.org> >>> Cc: Andrew Cooper <andrew.cooper3@citrix.com> >>> Signed-off-by: Matt Rushton <mrushton@amazon.com> >>> Signed-off-by: Matt Wilson <msw@amazon.com> >> How does dom0 work out that it is safe to join multiple pfns into a dma >> buffer without the swiotlb? > I'm not familiar enough with how this works to say. Perhaps Matt R. > can chime in during his day. My guess is that xen_swiotlb_alloc_coherent() > avoids allocating a contiguous region if the pages allocated already > happen to be physically contiguous. > > Konrad, can you enlighten us? The setup code in question that does the > remapping one page at a time is in arch/x86/xen/setup.c. > The swiotlb code will check if the underlying mfns are contiguous and use a bounce buffer if and only if they are not. Everything goes through the swiotlb via the normal Linux dma apis it's just a matter of if it uses a bounce buffer or not. >>> --- >>> xen/common/page_alloc.c | 5 +++-- >>> 1 file changed, 3 insertions(+), 2 deletions(-) >>> >>> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c >>> index 601319c..27e7f18 100644 >>> --- a/xen/common/page_alloc.c >>> +++ b/xen/common/page_alloc.c >>> @@ -677,9 +677,10 @@ static struct page_info *alloc_heap_pages( >>> /* We may have to halve the chunk a number of times. */ >>> while ( j != order ) >>> { >>> - PFN_ORDER(pg) = --j; >>> + struct page_info *pg2; >> At the very least, Xen style mandates a blank line after this variable >> declaration. > Ack. > >> ~Andrew >> >>> + pg2 = pg + (1 << --j); >>> + PFN_ORDER(pg) = j; >>> page_list_add_tail(pg, &heap(node, zone, j)); >>> - pg += 1 << j; >>> } >>> >>> ASSERT(avail[node][zone] >= request); ^ permalink raw reply [flat|nested] 55+ messages in thread
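A rough sketch of the check being described, assuming it amounts to walking the p2m and comparing successive MFNs (simplified logic, not the actual xen_swiotlb_map_page() / range_straddles_page_boundary() source; buffer_needs_bounce() is a made-up name):

    #include <linux/types.h>
    /* pfn_to_mfn() is the PV p2m lookup from <asm/xen/page.h>. */

    /* Bounce only if the buffer crosses a page boundary and the MFNs
     * backing it are not consecutive in machine address space. */
    static bool buffer_needs_bounce(phys_addr_t p, size_t size)
    {
        unsigned long pfn = p >> PAGE_SHIFT;
        unsigned long end_pfn = (p + size - 1) >> PAGE_SHIFT;

        for (; pfn < end_pfn; pfn++)
            if (pfn_to_mfn(pfn + 1) != pfn_to_mfn(pfn) + 1)
                return true;    /* discontiguous: take the bounce path */

        return false;           /* machine-contiguous: map in place */
    }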
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-03-25 11:22 [RFC PATCH] page_alloc: use first half of higher order chunks when halving Matt Wilson 2014-03-25 11:44 ` Andrew Cooper @ 2014-03-25 12:19 ` Tim Deegan 2014-03-25 13:27 ` Matt Wilson 1 sibling, 1 reply; 55+ messages in thread From: Tim Deegan @ 2014-03-25 12:19 UTC (permalink / raw) To: Matt Wilson Cc: Keir Fraser, Matt Wilson, Andrew Cooper, Matt Rushton, Jan Beulich, xen-devel At 13:22 +0200 on 25 Mar (1395750124), Matt Wilson wrote: > From: Matt Rushton <mrushton@amazon.com> > > This patch makes the Xen heap allocator use the first half of higher > order chunks instead of the second half when breaking them down for > smaller order allocations. > > Linux currently remaps the memory overlapping PCI space one page at a > time. Before this change this resulted in the mfns being allocated in > reverse order and led to discontiguous dom0 memory. This forced dom0 > to use bounce buffers for doing DMA and resulted in poor performance. This seems like something better fixed on the dom0 side, by asking explicitly for contiguous memory in cases where it makes a difference. On the Xen side, this change seems harmless, but we might like to keep the explicitly reversed allocation on debug builds, to flush out guests that rely on their memory being contiguous. > This change more gracefully handles the dom0 use case and returns > contiguous memory for subsequent allocations. > > Cc: xen-devel@lists.xenproject.org > Cc: Keir Fraser <keir@xen.org> > Cc: Jan Beulich <jbeulich@suse.com> > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > Cc: Tim Deegan <tim@xen.org> > Cc: Andrew Cooper <andrew.cooper3@citrix.com> > Signed-off-by: Matt Rushton <mrushton@amazon.com> > Signed-off-by: Matt Wilson <msw@amazon.com> > --- > xen/common/page_alloc.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c > index 601319c..27e7f18 100644 > --- a/xen/common/page_alloc.c > +++ b/xen/common/page_alloc.c > @@ -677,9 +677,10 @@ static struct page_info *alloc_heap_pages( > /* We may have to halve the chunk a number of times. */ > while ( j != order ) > { > - PFN_ORDER(pg) = --j; > + struct page_info *pg2; > + pg2 = pg + (1 << --j); > + PFN_ORDER(pg) = j; > page_list_add_tail(pg, &heap(node, zone, j)); > - pg += 1 << j; AFAICT this uses the low half (pg) for the allocation _and_ puts it on the freelist, and just leaks the high half (pg2). Am I missing something? Tim. ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-03-25 12:19 ` Tim Deegan @ 2014-03-25 13:27 ` Matt Wilson 2014-03-25 20:09 ` Matthew Rushton 0 siblings, 1 reply; 55+ messages in thread From: Matt Wilson @ 2014-03-25 13:27 UTC (permalink / raw) To: Tim Deegan Cc: Keir Fraser, Matt Wilson, Andrew Cooper, Matt Rushton, Jan Beulich, xen-devel On Tue, Mar 25, 2014 at 01:19:22PM +0100, Tim Deegan wrote: > At 13:22 +0200 on 25 Mar (1395750124), Matt Wilson wrote: > > From: Matt Rushton <mrushton@amazon.com> > > > > This patch makes the Xen heap allocator use the first half of higher > > order chunks instead of the second half when breaking them down for > > smaller order allocations. > > > > Linux currently remaps the memory overlapping PCI space one page at a > > time. Before this change this resulted in the mfns being allocated in > > reverse order and led to discontiguous dom0 memory. This forced dom0 > > to use bounce buffers for doing DMA and resulted in poor performance. > > This seems like something better fixed on the dom0 side, by asking > explicitly for contiguous memory in cases where it makes a difference. > On the Xen side, this change seems harmless, but we might like to keep > the explicitly reversed allocation on debug builds, to flush out > guests that rely on their memory being contiguous. Yes, I think that retaining the reverse allocation on debug builds is fine. I'd like Konrad's take on if it's better or possible to fix this on the Linux side. > > This change more gracefully handles the dom0 use case and returns > > contiguous memory for subsequent allocations. > > > > Cc: xen-devel@lists.xenproject.org > > Cc: Keir Fraser <keir@xen.org> > > Cc: Jan Beulich <jbeulich@suse.com> > > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > Cc: Tim Deegan <tim@xen.org> > > Cc: Andrew Cooper <andrew.cooper3@citrix.com> > > Signed-off-by: Matt Rushton <mrushton@amazon.com> > > Signed-off-by: Matt Wilson <msw@amazon.com> > > --- > > xen/common/page_alloc.c | 5 +++-- > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c > > index 601319c..27e7f18 100644 > > --- a/xen/common/page_alloc.c > > +++ b/xen/common/page_alloc.c > > @@ -677,9 +677,10 @@ static struct page_info *alloc_heap_pages( > > /* We may have to halve the chunk a number of times. */ > > while ( j != order ) > > { > > - PFN_ORDER(pg) = --j; > > + struct page_info *pg2; > > + pg2 = pg + (1 << --j); > > + PFN_ORDER(pg) = j; > > page_list_add_tail(pg, &heap(node, zone, j)); > > - pg += 1 << j; > > AFAICT this uses the low half (pg) for the allocation _and_ puts it on > the freelist, and just leaks the high half (pg2). Am I missing something? Argh, oops. this is totally my fault (not Matt R.'s). I ported the patch out of our development tree incorrectly. The code should have read: while ( j != order ) { struct page_info *pg2; pg2 = pg + (1 << --j); PFN_ORDER(pg2) = j; page_list_add_tail(pg2, &heap(node, zone, j)); } Apologies to Matt for my mangling of his patch (which also already had the correct blank line per Andy's comment). --msw ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-03-25 13:27 ` Matt Wilson @ 2014-03-25 20:09 ` Matthew Rushton 2014-03-26 9:55 ` Tim Deegan 0 siblings, 1 reply; 55+ messages in thread From: Matthew Rushton @ 2014-03-25 20:09 UTC (permalink / raw) To: Matt Wilson, Tim Deegan Cc: Keir Fraser, Jan Beulich, Andrew Cooper, Matt Wilson, xen-devel On 03/25/14 06:27, Matt Wilson wrote: > On Tue, Mar 25, 2014 at 01:19:22PM +0100, Tim Deegan wrote: >> At 13:22 +0200 on 25 Mar (1395750124), Matt Wilson wrote: >>> From: Matt Rushton <mrushton@amazon.com> >>> >>> This patch makes the Xen heap allocator use the first half of higher >>> order chunks instead of the second half when breaking them down for >>> smaller order allocations. >>> >>> Linux currently remaps the memory overlapping PCI space one page at a >>> time. Before this change this resulted in the mfns being allocated in >>> reverse order and led to discontiguous dom0 memory. This forced dom0 >>> to use bounce buffers for doing DMA and resulted in poor performance. >> This seems like something better fixed on the dom0 side, by asking >> explicitly for contiguous memory in cases where it makes a difference. >> On the Xen side, this change seems harmless, but we might like to keep >> the explicitly reversed allocation on debug builds, to flush out >> guests that rely on their memory being contiguous. > Yes, I think that retaining the reverse allocation on debug builds is > fine. I'd like Konrad's take on if it's better or possible to fix this > on the Linux side. I considered fixing it in Linux but this was a more straight forward change with no downside as far as I can tell. I see no reason in not fixing it in both places but this at least behaves more reasonably for one potential use case. I'm also interested in other opinions. >>> This change more gracefully handles the dom0 use case and returns >>> contiguous memory for subsequent allocations. >>> >>> Cc: xen-devel@lists.xenproject.org >>> Cc: Keir Fraser <keir@xen.org> >>> Cc: Jan Beulich <jbeulich@suse.com> >>> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> >>> Cc: Tim Deegan <tim@xen.org> >>> Cc: Andrew Cooper <andrew.cooper3@citrix.com> >>> Signed-off-by: Matt Rushton <mrushton@amazon.com> >>> Signed-off-by: Matt Wilson <msw@amazon.com> >>> --- >>> xen/common/page_alloc.c | 5 +++-- >>> 1 file changed, 3 insertions(+), 2 deletions(-) >>> >>> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c >>> index 601319c..27e7f18 100644 >>> --- a/xen/common/page_alloc.c >>> +++ b/xen/common/page_alloc.c >>> @@ -677,9 +677,10 @@ static struct page_info *alloc_heap_pages( >>> /* We may have to halve the chunk a number of times. */ >>> while ( j != order ) >>> { >>> - PFN_ORDER(pg) = --j; >>> + struct page_info *pg2; >>> + pg2 = pg + (1 << --j); >>> + PFN_ORDER(pg) = j; >>> page_list_add_tail(pg, &heap(node, zone, j)); >>> - pg += 1 << j; >> AFAICT this uses the low half (pg) for the allocation _and_ puts it on >> the freelist, and just leaks the high half (pg2). Am I missing something? > Argh, oops. this is totally my fault (not Matt R.'s). I ported the > patch out of our development tree incorrectly. The code should have > read: > > while ( j != order ) > { > struct page_info *pg2; > > pg2 = pg + (1 << --j); > PFN_ORDER(pg2) = j; > page_list_add_tail(pg2, &heap(node, zone, j)); > } > > Apologies to Matt for my mangling of his patch (which also already had > the correct blank line per Andy's comment). 
> > --msw No worries, I was about to correct you :) ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-03-25 20:09 ` Matthew Rushton @ 2014-03-26 9:55 ` Tim Deegan 2014-03-26 10:17 ` Matt Wilson 0 siblings, 1 reply; 55+ messages in thread From: Tim Deegan @ 2014-03-26 9:55 UTC (permalink / raw) To: Matthew Rushton Cc: Keir Fraser, Matt Wilson, Matt Wilson, Jan Beulich, Andrew Cooper, xen-devel Hi, At 13:09 -0700 on 25 Mar (1395749353), Matthew Rushton wrote: > On 03/25/14 06:27, Matt Wilson wrote: > > On Tue, Mar 25, 2014 at 01:19:22PM +0100, Tim Deegan wrote: > >> At 13:22 +0200 on 25 Mar (1395750124), Matt Wilson wrote: > >>> From: Matt Rushton <mrushton@amazon.com> > >>> > >>> This patch makes the Xen heap allocator use the first half of higher > >>> order chunks instead of the second half when breaking them down for > >>> smaller order allocations. > >>> > >>> Linux currently remaps the memory overlapping PCI space one page at a > >>> time. Before this change this resulted in the mfns being allocated in > >>> reverse order and led to discontiguous dom0 memory. This forced dom0 > >>> to use bounce buffers for doing DMA and resulted in poor performance. > >> This seems like something better fixed on the dom0 side, by asking > >> explicitly for contiguous memory in cases where it makes a difference. > >> On the Xen side, this change seems harmless, but we might like to keep > >> the explicitly reversed allocation on debug builds, to flush out > >> guests that rely on their memory being contiguous. > > Yes, I think that retaining the reverse allocation on debug builds is > > fine. I'd like Konrad's take on if it's better or possible to fix this > > on the Linux side. > > I considered fixing it in Linux but this was a more straight forward > change with no downside as far as I can tell. I see no reason in not > fixing it in both places but this at least behaves more reasonably for > one potential use case. I'm also interested in other opinions. Well, I'm happy enough with changing Xen (though it's common code so you'll need Keir's ack anyway rather than mine), since as you say it happens to make one use case a bit better and is otherwise harmless. But that comes with a stinking great warning: - This is not 'fixing' anything in Xen because Xen is doing exactly what dom0 asks for in the current code; and conversely - dom0 (and other guests) _must_not_ rely on it, whether for performance or correctness. Xen might change its page allocator at some point in the future, for any reason, and if linux perf starts sucking when that happens, that's (still) a linux bug. Cheers, Tim. ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-03-26 9:55 ` Tim Deegan @ 2014-03-26 10:17 ` Matt Wilson 2014-03-26 10:44 ` David Vrabel 2014-03-26 15:08 ` Konrad Rzeszutek Wilk 0 siblings, 2 replies; 55+ messages in thread From: Matt Wilson @ 2014-03-26 10:17 UTC (permalink / raw) To: Tim Deegan Cc: Keir Fraser, Matt Wilson, Matthew Rushton, Andrew Cooper, Jan Beulich, xen-devel On Wed, Mar 26, 2014 at 10:55:33AM +0100, Tim Deegan wrote: > Hi, > > At 13:09 -0700 on 25 Mar (1395749353), Matthew Rushton wrote: > > On 03/25/14 06:27, Matt Wilson wrote: > > > On Tue, Mar 25, 2014 at 01:19:22PM +0100, Tim Deegan wrote: > > >> At 13:22 +0200 on 25 Mar (1395750124), Matt Wilson wrote: > > >>> From: Matt Rushton <mrushton@amazon.com> > > >>> > > >>> This patch makes the Xen heap allocator use the first half of higher > > >>> order chunks instead of the second half when breaking them down for > > >>> smaller order allocations. > > >>> > > >>> Linux currently remaps the memory overlapping PCI space one page at a > > >>> time. Before this change this resulted in the mfns being allocated in > > >>> reverse order and led to discontiguous dom0 memory. This forced dom0 > > >>> to use bounce buffers for doing DMA and resulted in poor performance. > > >> This seems like something better fixed on the dom0 side, by asking > > >> explicitly for contiguous memory in cases where it makes a difference. > > >> On the Xen side, this change seems harmless, but we might like to keep > > >> the explicitly reversed allocation on debug builds, to flush out > > >> guests that rely on their memory being contiguous. > > > Yes, I think that retaining the reverse allocation on debug builds is > > > fine. I'd like Konrad's take on if it's better or possible to fix this > > > on the Linux side. > > > > I considered fixing it in Linux but this was a more straight forward > > change with no downside as far as I can tell. I see no reason in not > > fixing it in both places but this at least behaves more reasonably for > > one potential use case. I'm also interested in other opinions. > > Well, I'm happy enough with changing Xen (though it's common code so > you'll need Keir's ack anyway rather than mine), since as you say it > happens to make one use case a bit better and is otherwise harmless. > But that comes with a stinking great warning: Anyone can Ack or Nack, but I wouldn't want to move forward on a change like this without Keir's Ack. :-) > - This is not 'fixing' anything in Xen because Xen is doing exactly > what dom0 asks for in the current code; and conversely > > - dom0 (and other guests) _must_not_ rely on it, whether for > performance or correctness. Xen might change its page allocator at > some point in the future, for any reason, and if linux perf starts > sucking when that happens, that's (still) a linux bug. I agree with both of these. This was just the "least change" patch to a particular problem we observed. Konrad, what's the possibility of fixing this in Linux Xen PV setup code? I think it'd be a matter batching up pages and doing larger order allocations in linux/arch/x86/xen/setup.c:xen_do_chunk(), falling back to smaller pages if allocations fail due to fragmentation, etc. --msw ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-03-26 10:17 ` Matt Wilson @ 2014-03-26 10:44 ` David Vrabel 2014-03-26 10:48 ` Matt Wilson 2014-03-26 15:08 ` Konrad Rzeszutek Wilk 1 sibling, 1 reply; 55+ messages in thread From: David Vrabel @ 2014-03-26 10:44 UTC (permalink / raw) To: Matt Wilson Cc: Keir Fraser, Matt Wilson, Matthew Rushton, Andrew Cooper, Tim Deegan, Jan Beulich, xen-devel, Malcolm Crossley On 26/03/14 10:17, Matt Wilson wrote: > > Konrad, what's the possibility of fixing this in Linux Xen PV setup > code? I think it'd be a matter batching up pages and doing larger > order allocations in linux/arch/x86/xen/setup.c:xen_do_chunk(), > falling back to smaller pages if allocations fail due to > fragmentation, etc. We plan to fix problems caused by non-machine-contiguous memory by setting up the IOMMU to have 1:1 bus to pseudo-physical mappings. This would avoid using the swiotlb always[1], regardless of the machine layout of dom0 or the driver domain. I think I would prefer this approach rather than making xen/setup.c even more horribly complicated. Malcolm has a working prototype of this already. David [1] Unless you have a non-64 bit DMA capable device, but we don't care about performance with these. ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-03-26 10:44 ` David Vrabel @ 2014-03-26 10:48 ` Matt Wilson 2014-03-26 11:13 ` Ian Campbell 2014-03-26 12:43 ` David Vrabel 0 siblings, 2 replies; 55+ messages in thread From: Matt Wilson @ 2014-03-26 10:48 UTC (permalink / raw) To: David Vrabel Cc: Keir Fraser, Matt Wilson, Matthew Rushton, Andrew Cooper, Tim Deegan, Jan Beulich, xen-devel, Malcolm Crossley On Wed, Mar 26, 2014 at 10:44:18AM +0000, David Vrabel wrote: > On 26/03/14 10:17, Matt Wilson wrote: > > > > Konrad, what's the possibility of fixing this in Linux Xen PV setup > > code? I think it'd be a matter batching up pages and doing larger > > order allocations in linux/arch/x86/xen/setup.c:xen_do_chunk(), > > falling back to smaller pages if allocations fail due to > > fragmentation, etc. > > We plan to fix problems caused by non-machine-contiguous memory by > setting up the IOMMU to have 1:1 bus to pseudo-physical mappings. This > would avoid using the swiotlb always[1], regardless of the machine > layout of dom0 or the driver domain. > > I think I would prefer this approach rather than making xen/setup.c even > more horribly complicated. I imagine that some users will not want to run dom0 under an IOMMU. If changing Linux Xen PV setup is (rightly) objectionable due to complexity, perhaps this small change to the hypervisor is a better short-term fix. > Malcolm has a working prototype of this already. Is it ready for a RFC posting? > David > > [1] Unless you have a non-64 bit DMA capable device, but we don't care > about performance with these. --msw ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-03-26 10:48 ` Matt Wilson @ 2014-03-26 11:13 ` Ian Campbell 2014-03-26 11:41 ` Matt Wilson 2014-03-26 12:43 ` David Vrabel 1 sibling, 1 reply; 55+ messages in thread From: Ian Campbell @ 2014-03-26 11:13 UTC (permalink / raw) To: Matt Wilson Cc: Keir Fraser, Jan Beulich, Matthew Rushton, Andrew Cooper, Tim Deegan, David Vrabel, Matt Wilson, xen-devel, Malcolm Crossley On Wed, 2014-03-26 at 12:48 +0200, Matt Wilson wrote: > On Wed, Mar 26, 2014 at 10:44:18AM +0000, David Vrabel wrote: > > On 26/03/14 10:17, Matt Wilson wrote: > > > > > > Konrad, what's the possibility of fixing this in Linux Xen PV setup > > > code? I think it'd be a matter batching up pages and doing larger > > > order allocations in linux/arch/x86/xen/setup.c:xen_do_chunk(), > > > falling back to smaller pages if allocations fail due to > > > fragmentation, etc. > > > > We plan to fix problems caused by non-machine-contiguous memory by > > setting up the IOMMU to have 1:1 bus to pseudo-physical mappings. This > > would avoid using the swiotlb always[1], regardless of the machine > > layout of dom0 or the driver domain. > > > > I think I would prefer this approach rather than making xen/setup.c even > > more horribly complicated. > > I imagine that some users will not want to run dom0 under an IOMMU. Then they have chosen that (for whatever reason) over performance, right? ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-03-26 11:13 ` Ian Campbell @ 2014-03-26 11:41 ` Matt Wilson 2014-03-26 11:45 ` Andrew Cooper 0 siblings, 1 reply; 55+ messages in thread From: Matt Wilson @ 2014-03-26 11:41 UTC (permalink / raw) To: Ian Campbell Cc: Keir Fraser, Jan Beulich, Matthew Rushton, Andrew Cooper, Tim Deegan, David Vrabel, Matt Wilson, xen-devel, Malcolm Crossley On Wed, Mar 26, 2014 at 11:13:49AM +0000, Ian Campbell wrote: > On Wed, 2014-03-26 at 12:48 +0200, Matt Wilson wrote: > > On Wed, Mar 26, 2014 at 10:44:18AM +0000, David Vrabel wrote: > > > On 26/03/14 10:17, Matt Wilson wrote: > > > > > > > > Konrad, what's the possibility of fixing this in Linux Xen PV setup > > > > code? I think it'd be a matter batching up pages and doing larger > > > > order allocations in linux/arch/x86/xen/setup.c:xen_do_chunk(), > > > > falling back to smaller pages if allocations fail due to > > > > fragmentation, etc. > > > > > > We plan to fix problems caused by non-machine-contiguous memory by > > > setting up the IOMMU to have 1:1 bus to pseudo-physical mappings. This > > > would avoid using the swiotlb always[1], regardless of the machine > > > layout of dom0 or the driver domain. > > > > > > I think I would prefer this approach rather than making xen/setup.c even > > > more horribly complicated. > > > > I imagine that some users will not want to run dom0 under an IOMMU. > > Then they have chosen that (for whatever reason) over performance, > right? IOMMU is not free, so I imagine that some users would actually be choosing to avoid it specifically for performance reasons. --msw ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-03-26 11:41 ` Matt Wilson @ 2014-03-26 11:45 ` Andrew Cooper 2014-03-26 11:50 ` Matt Wilson 0 siblings, 1 reply; 55+ messages in thread From: Andrew Cooper @ 2014-03-26 11:45 UTC (permalink / raw) To: Matt Wilson Cc: Keir Fraser, Ian Campbell, Matthew Rushton, Tim Deegan, Jan Beulich, David Vrabel, Matt Wilson, xen-devel, Malcolm Crossley On 26/03/14 11:41, Matt Wilson wrote: > On Wed, Mar 26, 2014 at 11:13:49AM +0000, Ian Campbell wrote: >> On Wed, 2014-03-26 at 12:48 +0200, Matt Wilson wrote: >>> On Wed, Mar 26, 2014 at 10:44:18AM +0000, David Vrabel wrote: >>>> On 26/03/14 10:17, Matt Wilson wrote: >>>>> Konrad, what's the possibility of fixing this in Linux Xen PV setup >>>>> code? I think it'd be a matter batching up pages and doing larger >>>>> order allocations in linux/arch/x86/xen/setup.c:xen_do_chunk(), >>>>> falling back to smaller pages if allocations fail due to >>>>> fragmentation, etc. >>>> We plan to fix problems caused by non-machine-contiguous memory by >>>> setting up the IOMMU to have 1:1 bus to pseudo-physical mappings. This >>>> would avoid using the swiotlb always[1], regardless of the machine >>>> layout of dom0 or the driver domain. >>>> >>>> I think I would prefer this approach rather than making xen/setup.c even >>>> more horribly complicated. >>> I imagine that some users will not want to run dom0 under an IOMMU. >> Then they have chosen that (for whatever reason) over performance, >> right? > IOMMU is not free, so I imagine that some users would actually be > choosing to avoid it specifically for performance reasons. > > --msw True, but if people are actually looking for performance, they will be turning on features like GRO and more generically scatter/gather which results in drivers using single page DMA mappings at a time, which completely bypass the bouncing in the swiotlb. ~Andrew ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-03-26 11:45 ` Andrew Cooper @ 2014-03-26 11:50 ` Matt Wilson 0 siblings, 0 replies; 55+ messages in thread From: Matt Wilson @ 2014-03-26 11:50 UTC (permalink / raw) To: Andrew Cooper Cc: Keir Fraser, Ian Campbell, Matthew Rushton, Tim Deegan, Jan Beulich, David Vrabel, Matt Wilson, xen-devel, Malcolm Crossley On Wed, Mar 26, 2014 at 11:45:50AM +0000, Andrew Cooper wrote: > On 26/03/14 11:41, Matt Wilson wrote: > > On Wed, Mar 26, 2014 at 11:13:49AM +0000, Ian Campbell wrote: > >> On Wed, 2014-03-26 at 12:48 +0200, Matt Wilson wrote: > >>> I imagine that some users will not want to run dom0 under an IOMMU. > >> > >> Then they have chosen that (for whatever reason) over performance, > >> right? > > > > IOMMU is not free, so I imagine that some users would actually be > > choosing to avoid it specifically for performance reasons. > > > > --msw > > True, but if people are actually looking for performance, they will be > turning on features like GRO and more generically scatter/gather which > results in drivers using single page DMA mappings at a time, which > completely bypass the bouncing in the swiotlb. Alas, GRO isn't without its own problems (at least historically). --msw ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-03-26 10:48 ` Matt Wilson 2014-03-26 11:13 ` Ian Campbell @ 2014-03-26 12:43 ` David Vrabel 2014-03-26 12:48 ` Matt Wilson 1 sibling, 1 reply; 55+ messages in thread From: David Vrabel @ 2014-03-26 12:43 UTC (permalink / raw) To: Matt Wilson Cc: Keir Fraser, Jan Beulich, Matthew Rushton, Andrew Cooper, Tim Deegan, David Vrabel, Matt Wilson, xen-devel, Malcolm Crossley On 26/03/14 10:48, Matt Wilson wrote: > On Wed, Mar 26, 2014 at 10:44:18AM +0000, David Vrabel wrote: >> On 26/03/14 10:17, Matt Wilson wrote: >>> >>> Konrad, what's the possibility of fixing this in Linux Xen PV setup >>> code? I think it'd be a matter batching up pages and doing larger >>> order allocations in linux/arch/x86/xen/setup.c:xen_do_chunk(), >>> falling back to smaller pages if allocations fail due to >>> fragmentation, etc. >> >> We plan to fix problems caused by non-machine-contiguous memory by >> setting up the IOMMU to have 1:1 bus to pseudo-physical mappings. This >> would avoid using the swiotlb always[1], regardless of the machine >> layout of dom0 or the driver domain. >> >> I think I would prefer this approach rather than making xen/setup.c even >> more horribly complicated. > > I imagine that some users will not want to run dom0 under an IOMMU. If > changing Linux Xen PV setup is (rightly) objectionable due to > complexity, perhaps this small change to the hypervisor is a better > short-term fix. Users who are not using the IOMMU for performance reasons but are complaining about swiotlb costs? I'm not sure that's an interesting set of users... I'm not ruling out any Linux-side memory setup changes. I just don't think they're a complete solution. >> Malcolm has a working prototype of this already. > > Is it ready for a RFC posting? Not sure. Malcolm is current OoO. David ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-03-26 12:43 ` David Vrabel @ 2014-03-26 12:48 ` Matt Wilson 0 siblings, 0 replies; 55+ messages in thread From: Matt Wilson @ 2014-03-26 12:48 UTC (permalink / raw) To: David Vrabel Cc: Keir Fraser, Matt Wilson, Matthew Rushton, Andrew Cooper, Tim Deegan, Jan Beulich, xen-devel, Malcolm Crossley On Wed, Mar 26, 2014 at 12:43:25PM +0000, David Vrabel wrote: > On 26/03/14 10:48, Matt Wilson wrote: > > On Wed, Mar 26, 2014 at 10:44:18AM +0000, David Vrabel wrote: > >> On 26/03/14 10:17, Matt Wilson wrote: > >>> > >>> Konrad, what's the possibility of fixing this in Linux Xen PV setup > >>> code? I think it'd be a matter batching up pages and doing larger > >>> order allocations in linux/arch/x86/xen/setup.c:xen_do_chunk(), > >>> falling back to smaller pages if allocations fail due to > >>> fragmentation, etc. > >> > >> We plan to fix problems caused by non-machine-contiguous memory by > >> setting up the IOMMU to have 1:1 bus to pseudo-physical mappings. This > >> would avoid using the swiotlb always[1], regardless of the machine > >> layout of dom0 or the driver domain. > >> > >> I think I would prefer this approach rather than making xen/setup.c even > >> more horribly complicated. > > > > I imagine that some users will not want to run dom0 under an IOMMU. If > > changing Linux Xen PV setup is (rightly) objectionable due to > > complexity, perhaps this small change to the hypervisor is a better > > short-term fix. > > Users who are not using the IOMMU for performance reasons but are > complaining about swiotlb costs? I'm not sure that's an interesting set > of users... You have to admit that a configuration that avoids both IOMMU and bouncing in swiotlb will be the best performance in many scenarios, don't you think? --msw ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-03-26 10:17 ` Matt Wilson 2014-03-26 10:44 ` David Vrabel @ 2014-03-26 15:08 ` Konrad Rzeszutek Wilk 2014-03-26 15:15 ` Matt Wilson 1 sibling, 1 reply; 55+ messages in thread From: Konrad Rzeszutek Wilk @ 2014-03-26 15:08 UTC (permalink / raw) To: Matt Wilson Cc: Keir Fraser, Matt Wilson, Matthew Rushton, Andrew Cooper, Tim Deegan, Jan Beulich, xen-devel On Wed, Mar 26, 2014 at 12:17:53PM +0200, Matt Wilson wrote: > On Wed, Mar 26, 2014 at 10:55:33AM +0100, Tim Deegan wrote: > > Hi, > > > > At 13:09 -0700 on 25 Mar (1395749353), Matthew Rushton wrote: > > > On 03/25/14 06:27, Matt Wilson wrote: > > > > On Tue, Mar 25, 2014 at 01:19:22PM +0100, Tim Deegan wrote: > > > >> At 13:22 +0200 on 25 Mar (1395750124), Matt Wilson wrote: > > > >>> From: Matt Rushton <mrushton@amazon.com> > > > >>> > > > >>> This patch makes the Xen heap allocator use the first half of higher > > > >>> order chunks instead of the second half when breaking them down for > > > >>> smaller order allocations. > > > >>> > > > >>> Linux currently remaps the memory overlapping PCI space one page at a > > > >>> time. Before this change this resulted in the mfns being allocated in > > > >>> reverse order and led to discontiguous dom0 memory. This forced dom0 > > > >>> to use bounce buffers for doing DMA and resulted in poor performance. > > > >> This seems like something better fixed on the dom0 side, by asking > > > >> explicitly for contiguous memory in cases where it makes a difference. > > > >> On the Xen side, this change seems harmless, but we might like to keep > > > >> the explicitly reversed allocation on debug builds, to flush out > > > >> guests that rely on their memory being contiguous. > > > > Yes, I think that retaining the reverse allocation on debug builds is > > > > fine. I'd like Konrad's take on if it's better or possible to fix this > > > > on the Linux side. > > > > > > I considered fixing it in Linux but this was a more straight forward > > > change with no downside as far as I can tell. I see no reason in not > > > fixing it in both places but this at least behaves more reasonably for > > > one potential use case. I'm also interested in other opinions. > > > > Well, I'm happy enough with changing Xen (though it's common code so > > you'll need Keir's ack anyway rather than mine), since as you say it > > happens to make one use case a bit better and is otherwise harmless. > > But that comes with a stinking great warning: > > Anyone can Ack or Nack, but I wouldn't want to move forward on a > change like this without Keir's Ack. :-) > > > - This is not 'fixing' anything in Xen because Xen is doing exactly > > what dom0 asks for in the current code; and conversely > > > > - dom0 (and other guests) _must_not_ rely on it, whether for > > performance or correctness. Xen might change its page allocator at > > some point in the future, for any reason, and if linux perf starts > > sucking when that happens, that's (still) a linux bug. > > I agree with both of these. This was just the "least change" patch to > a particular problem we observed. > > Konrad, what's the possibility of fixing this in Linux Xen PV setup > code? I think it'd be a matter batching up pages and doing larger > order allocations in linux/arch/x86/xen/setup.c:xen_do_chunk(), > falling back to smaller pages if allocations fail due to > fragmentation, etc. Could you elaborate a bit more on the use-case please? 
My understanding is that most drivers use a scatter-gather list - in which case it does not matter if the underlying MFNs in the PFN space are not contiguous. But I presume the issue you are hitting is with drivers doing dma_map_page and the page is not 4KB but rather large (a compound page). Is that the problem you have observed? Thanks. > > --msw ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-03-26 15:08 ` Konrad Rzeszutek Wilk @ 2014-03-26 15:15 ` Matt Wilson 2014-03-26 15:59 ` Matthew Rushton 2014-03-26 16:34 ` Konrad Rzeszutek Wilk 0 siblings, 2 replies; 55+ messages in thread From: Matt Wilson @ 2014-03-26 15:15 UTC (permalink / raw) To: Konrad Rzeszutek Wilk Cc: Keir Fraser, Matt Wilson, Matthew Rushton, Andrew Cooper, Tim Deegan, Jan Beulich, xen-devel On Wed, Mar 26, 2014 at 11:08:01AM -0400, Konrad Rzeszutek Wilk wrote: > > Could you elaborate a bit more on the use-case please? > My understanding is that most drivers use a scatter gather list - in which > case it does not matter if the underlaying MFNs in the PFNs spare are > not contingous. > > But I presume the issue you are hitting is with drivers doing dma_map_page > and the page is not 4KB but rather large (compound page). Is that the > problem you have observed? Drivers are using very large size arguments to dma_alloc_coherent() for things like RX and TX descriptor rings. --msw ^ permalink raw reply [flat|nested] 55+ messages in thread
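For reference, the coherent-allocation pattern being referred to usually looks something like this in a NIC driver (the structure and function names below are illustrative, not taken from any particular driver):

    #include <linux/types.h>
    #include <linux/dma-mapping.h>

    struct my_tx_desc { __le64 addr; __le32 len_flags; __le32 status; };

    struct my_tx_ring {
        struct my_tx_desc *desc;   /* descriptor array seen by the NIC */
        dma_addr_t dma;            /* bus address programmed into the NIC */
        unsigned int count;        /* number of descriptors, often 256-4096 */
    };

    /* One coherent allocation covering the whole ring; with a few thousand
     * descriptors this is tens of kilobytes, far larger than one page.
     * Under Xen the swiotlb backs coherent allocations with machine-
     * contiguous memory, so (as noted later in the thread) these are not
     * the allocations that end up bouncing. */
    static int my_alloc_tx_ring(struct device *dev, struct my_tx_ring *ring)
    {
        size_t size = ring->count * sizeof(*ring->desc);

        ring->desc = dma_alloc_coherent(dev, size, &ring->dma, GFP_KERNEL);
        return ring->desc ? 0 : -ENOMEM;
    }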
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-03-26 15:15 ` Matt Wilson @ 2014-03-26 15:59 ` Matthew Rushton 2014-03-26 16:36 ` Konrad Rzeszutek Wilk 2014-03-26 16:34 ` Konrad Rzeszutek Wilk 1 sibling, 1 reply; 55+ messages in thread From: Matthew Rushton @ 2014-03-26 15:59 UTC (permalink / raw) To: Matt Wilson, Konrad Rzeszutek Wilk Cc: Keir Fraser, Jan Beulich, Andrew Cooper, Tim Deegan, Matt Wilson, xen-devel On 03/26/14 08:15, Matt Wilson wrote: > On Wed, Mar 26, 2014 at 11:08:01AM -0400, Konrad Rzeszutek Wilk wrote: >> Could you elaborate a bit more on the use-case please? >> My understanding is that most drivers use a scatter gather list - in which >> case it does not matter if the underlaying MFNs in the PFNs spare are >> not contingous. >> >> But I presume the issue you are hitting is with drivers doing dma_map_page >> and the page is not 4KB but rather large (compound page). Is that the >> problem you have observed? > Drivers are using very large size arguments to dma_alloc_coherent() > for things like RX and TX descriptor rings. > > --msw It's the dma streaming api I've noticed the problem with, so dma_map_single(). Applicable swiotlb code would be xen_swiotlb_map_page() and range_straddles_page_boundary(). So yes for larger buffers it can cause bouncing. ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-03-26 15:59 ` Matthew Rushton @ 2014-03-26 16:36 ` Konrad Rzeszutek Wilk 2014-03-26 17:47 ` Matthew Rushton 0 siblings, 1 reply; 55+ messages in thread From: Konrad Rzeszutek Wilk @ 2014-03-26 16:36 UTC (permalink / raw) To: Matthew Rushton Cc: Keir Fraser, Matt Wilson, Matt Wilson, Tim Deegan, Jan Beulich, Andrew Cooper, xen-devel On Wed, Mar 26, 2014 at 08:59:04AM -0700, Matthew Rushton wrote: > On 03/26/14 08:15, Matt Wilson wrote: > >On Wed, Mar 26, 2014 at 11:08:01AM -0400, Konrad Rzeszutek Wilk wrote: > >>Could you elaborate a bit more on the use-case please? > >>My understanding is that most drivers use a scatter gather list - in which > >>case it does not matter if the underlaying MFNs in the PFNs spare are > >>not contingous. > >> > >>But I presume the issue you are hitting is with drivers doing dma_map_page > >>and the page is not 4KB but rather large (compound page). Is that the > >>problem you have observed? > >Drivers are using very large size arguments to dma_alloc_coherent() > >for things like RX and TX descriptor rings. Large size like larger than 512kB? That would also cause problems on baremetal then when swiotlb is activated I believe. > > > >--msw > > It's the dma streaming api I've noticed the problem with, so > dma_map_single(). Applicable swiotlb code would be > xen_swiotlb_map_page() and range_straddles_page_boundary(). So yes > for larger buffers it can cause bouncing. ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-03-26 16:36 ` Konrad Rzeszutek Wilk @ 2014-03-26 17:47 ` Matthew Rushton 2014-03-26 17:56 ` Konrad Rzeszutek Wilk 0 siblings, 1 reply; 55+ messages in thread From: Matthew Rushton @ 2014-03-26 17:47 UTC (permalink / raw) To: Konrad Rzeszutek Wilk Cc: Keir Fraser, Matt Wilson, Matt Wilson, Tim Deegan, Jan Beulich, Andrew Cooper, xen-devel On 03/26/14 09:36, Konrad Rzeszutek Wilk wrote: > On Wed, Mar 26, 2014 at 08:59:04AM -0700, Matthew Rushton wrote: >> On 03/26/14 08:15, Matt Wilson wrote: >>> On Wed, Mar 26, 2014 at 11:08:01AM -0400, Konrad Rzeszutek Wilk wrote: >>>> Could you elaborate a bit more on the use-case please? >>>> My understanding is that most drivers use a scatter gather list - in which >>>> case it does not matter if the underlaying MFNs in the PFNs spare are >>>> not contingous. >>>> >>>> But I presume the issue you are hitting is with drivers doing dma_map_page >>>> and the page is not 4KB but rather large (compound page). Is that the >>>> problem you have observed? >>> Drivers are using very large size arguments to dma_alloc_coherent() >>> for things like RX and TX descriptor rings. > Large size like larger than 512kB? That would also cause problems > on baremetal then when swiotlb is activated I believe. I was looking at network IO performance so the buffers would not have been that large. I think large in this context is relative to the 4k page size and the odds of the buffer spanning a page boundary. For context I saw ~5-10% performance increase with guest network throughput by avoiding bounce buffers and also saw dom0 tcp streaming performance go from ~6Gb/s to over 9Gb/s on my test setup with a 10Gb NIC. > >>> --msw >> It's the dma streaming api I've noticed the problem with, so >> dma_map_single(). Applicable swiotlb code would be >> xen_swiotlb_map_page() and range_straddles_page_boundary(). So yes >> for larger buffers it can cause bouncing. ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-03-26 17:47 ` Matthew Rushton @ 2014-03-26 17:56 ` Konrad Rzeszutek Wilk 2014-03-26 22:15 ` Matthew Rushton 0 siblings, 1 reply; 55+ messages in thread From: Konrad Rzeszutek Wilk @ 2014-03-26 17:56 UTC (permalink / raw) To: Matthew Rushton Cc: Keir Fraser, Matt Wilson, Matt Wilson, Tim Deegan, Jan Beulich, Andrew Cooper, xen-devel On Wed, Mar 26, 2014 at 10:47:44AM -0700, Matthew Rushton wrote: > On 03/26/14 09:36, Konrad Rzeszutek Wilk wrote: > >On Wed, Mar 26, 2014 at 08:59:04AM -0700, Matthew Rushton wrote: > >>On 03/26/14 08:15, Matt Wilson wrote: > >>>On Wed, Mar 26, 2014 at 11:08:01AM -0400, Konrad Rzeszutek Wilk wrote: > >>>>Could you elaborate a bit more on the use-case please? > >>>>My understanding is that most drivers use a scatter gather list - in which > >>>>case it does not matter if the underlaying MFNs in the PFNs spare are > >>>>not contingous. > >>>> > >>>>But I presume the issue you are hitting is with drivers doing dma_map_page > >>>>and the page is not 4KB but rather large (compound page). Is that the > >>>>problem you have observed? > >>>Drivers are using very large size arguments to dma_alloc_coherent() > >>>for things like RX and TX descriptor rings. > >Large size like larger than 512kB? That would also cause problems > >on baremetal then when swiotlb is activated I believe. > > I was looking at network IO performance so the buffers would not > have been that large. I think large in this context is relative to > the 4k page size and the odds of the buffer spanning a page > boundary. For context I saw ~5-10% performance increase with guest > network throughput by avoiding bounce buffers and also saw dom0 tcp > streaming performance go from ~6Gb/s to over 9Gb/s on my test setup > with a 10Gb NIC. OK, but that would not be the dma_alloc_coherent ones then? That sounds more like the generic TCP mechanism allocated 64KB pages instead of 4KB and used those. Did you try looking at this hack that Ian proposed a long time ago to verify that it is said problem? https://lkml.org/lkml/2013/9/4/540 > > > > >>>--msw > >>It's the dma streaming api I've noticed the problem with, so > >>dma_map_single(). Applicable swiotlb code would be > >>xen_swiotlb_map_page() and range_straddles_page_boundary(). So yes > >>for larger buffers it can cause bouncing. > ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-03-26 17:56 ` Konrad Rzeszutek Wilk @ 2014-03-26 22:15 ` Matthew Rushton 2014-03-28 17:02 ` Konrad Rzeszutek Wilk 0 siblings, 1 reply; 55+ messages in thread From: Matthew Rushton @ 2014-03-26 22:15 UTC (permalink / raw) To: Konrad Rzeszutek Wilk Cc: Keir Fraser, Matt Wilson, Matt Wilson, Tim Deegan, Jan Beulich, Andrew Cooper, xen-devel On 03/26/14 10:56, Konrad Rzeszutek Wilk wrote: > On Wed, Mar 26, 2014 at 10:47:44AM -0700, Matthew Rushton wrote: >> On 03/26/14 09:36, Konrad Rzeszutek Wilk wrote: >>> On Wed, Mar 26, 2014 at 08:59:04AM -0700, Matthew Rushton wrote: >>>> On 03/26/14 08:15, Matt Wilson wrote: >>>>> On Wed, Mar 26, 2014 at 11:08:01AM -0400, Konrad Rzeszutek Wilk wrote: >>>>>> Could you elaborate a bit more on the use-case please? >>>>>> My understanding is that most drivers use a scatter gather list - in which >>>>>> case it does not matter if the underlaying MFNs in the PFNs spare are >>>>>> not contingous. >>>>>> >>>>>> But I presume the issue you are hitting is with drivers doing dma_map_page >>>>>> and the page is not 4KB but rather large (compound page). Is that the >>>>>> problem you have observed? >>>>> Drivers are using very large size arguments to dma_alloc_coherent() >>>>> for things like RX and TX descriptor rings. >>> Large size like larger than 512kB? That would also cause problems >>> on baremetal then when swiotlb is activated I believe. >> I was looking at network IO performance so the buffers would not >> have been that large. I think large in this context is relative to >> the 4k page size and the odds of the buffer spanning a page >> boundary. For context I saw ~5-10% performance increase with guest >> network throughput by avoiding bounce buffers and also saw dom0 tcp >> streaming performance go from ~6Gb/s to over 9Gb/s on my test setup >> with a 10Gb NIC. > OK, but that would not be the dma_alloc_coherent ones then? That sounds > more like the generic TCP mechanism allocated 64KB pages instead of 4KB > and used those. > > Did you try looking at this hack that Ian proposed a long time ago > to verify that it is said problem? > > https://lkml.org/lkml/2013/9/4/540 > Yes I had seen that and intially had the same reaction but the change was relatively recent and not relevant. I *think* all the coherent allocations are ok since the swiotlb makes them contiguous. The problem comes with the use of the streaming api. As one example with jumbo frames enabled a driver might use larger rx buffers which triggers the problem. I think the right thing to do is to make the dma streaming api work better with larger buffers on dom0. That way it works across all drivers and device types regardless of how they were designed. >>>>> --msw >>>> It's the dma streaming api I've noticed the problem with, so >>>> dma_map_single(). Applicable swiotlb code would be >>>> xen_swiotlb_map_page() and range_straddles_page_boundary(). So yes >>>> for larger buffers it can cause bouncing. ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-03-26 22:15 ` Matthew Rushton @ 2014-03-28 17:02 ` Konrad Rzeszutek Wilk 2014-03-28 22:06 ` Matthew Rushton 0 siblings, 1 reply; 55+ messages in thread From: Konrad Rzeszutek Wilk @ 2014-03-28 17:02 UTC (permalink / raw) To: Matthew Rushton Cc: Keir Fraser, Matt Wilson, Matt Wilson, Tim Deegan, Jan Beulich, Andrew Cooper, xen-devel On Wed, Mar 26, 2014 at 03:15:42PM -0700, Matthew Rushton wrote: > On 03/26/14 10:56, Konrad Rzeszutek Wilk wrote: > >On Wed, Mar 26, 2014 at 10:47:44AM -0700, Matthew Rushton wrote: > >>On 03/26/14 09:36, Konrad Rzeszutek Wilk wrote: > >>>On Wed, Mar 26, 2014 at 08:59:04AM -0700, Matthew Rushton wrote: > >>>>On 03/26/14 08:15, Matt Wilson wrote: > >>>>>On Wed, Mar 26, 2014 at 11:08:01AM -0400, Konrad Rzeszutek Wilk wrote: > >>>>>>Could you elaborate a bit more on the use-case please? > >>>>>>My understanding is that most drivers use a scatter gather list - in which > >>>>>>case it does not matter if the underlaying MFNs in the PFNs spare are > >>>>>>not contingous. > >>>>>> > >>>>>>But I presume the issue you are hitting is with drivers doing dma_map_page > >>>>>>and the page is not 4KB but rather large (compound page). Is that the > >>>>>>problem you have observed? > >>>>>Drivers are using very large size arguments to dma_alloc_coherent() > >>>>>for things like RX and TX descriptor rings. > >>>Large size like larger than 512kB? That would also cause problems > >>>on baremetal then when swiotlb is activated I believe. > >>I was looking at network IO performance so the buffers would not > >>have been that large. I think large in this context is relative to > >>the 4k page size and the odds of the buffer spanning a page > >>boundary. For context I saw ~5-10% performance increase with guest > >>network throughput by avoiding bounce buffers and also saw dom0 tcp > >>streaming performance go from ~6Gb/s to over 9Gb/s on my test setup > >>with a 10Gb NIC. > >OK, but that would not be the dma_alloc_coherent ones then? That sounds > >more like the generic TCP mechanism allocated 64KB pages instead of 4KB > >and used those. > > > >Did you try looking at this hack that Ian proposed a long time ago > >to verify that it is said problem? > > > >https://lkml.org/lkml/2013/9/4/540 > > > > Yes I had seen that and intially had the same reaction but the > change was relatively recent and not relevant. I *think* all the > coherent allocations are ok since the swiotlb makes them contiguous. > The problem comes with the use of the streaming api. As one example > with jumbo frames enabled a driver might use larger rx buffers which > triggers the problem. > > I think the right thing to do is to make the dma streaming api work > better with larger buffers on dom0. That way it works across all OK. > drivers and device types regardless of how they were designed. Can you point me to an example of the DMA streaming API? I am not sure if you mean 'streaming API' as scatter gather operations using DMA API? Is there a particular easy way for me to reproduce this. I have to say I hadn't enabled Jumbo frame on my box since I am not even sure if the switch I have can do it. Is there a idiots-punch-list of how to reproduce this? Thanks! > > >>>>>--msw > >>>>It's the dma streaming api I've noticed the problem with, so > >>>>dma_map_single(). Applicable swiotlb code would be > >>>>xen_swiotlb_map_page() and range_straddles_page_boundary(). So yes > >>>>for larger buffers it can cause bouncing. 
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-03-28 17:02 ` Konrad Rzeszutek Wilk @ 2014-03-28 22:06 ` Matthew Rushton 2014-03-31 14:15 ` Konrad Rzeszutek Wilk 0 siblings, 1 reply; 55+ messages in thread From: Matthew Rushton @ 2014-03-28 22:06 UTC (permalink / raw) To: Konrad Rzeszutek Wilk Cc: Keir Fraser, Matt Wilson, Matt Wilson, Tim Deegan, Jan Beulich, Andrew Cooper, xen-devel On 03/28/14 10:02, Konrad Rzeszutek Wilk wrote: > On Wed, Mar 26, 2014 at 03:15:42PM -0700, Matthew Rushton wrote: >> On 03/26/14 10:56, Konrad Rzeszutek Wilk wrote: >>> On Wed, Mar 26, 2014 at 10:47:44AM -0700, Matthew Rushton wrote: >>>> On 03/26/14 09:36, Konrad Rzeszutek Wilk wrote: >>>>> On Wed, Mar 26, 2014 at 08:59:04AM -0700, Matthew Rushton wrote: >>>>>> On 03/26/14 08:15, Matt Wilson wrote: >>>>>>> On Wed, Mar 26, 2014 at 11:08:01AM -0400, Konrad Rzeszutek Wilk wrote: >>>>>>>> Could you elaborate a bit more on the use-case please? >>>>>>>> My understanding is that most drivers use a scatter gather list - in which >>>>>>>> case it does not matter if the underlaying MFNs in the PFNs spare are >>>>>>>> not contingous. >>>>>>>> >>>>>>>> But I presume the issue you are hitting is with drivers doing dma_map_page >>>>>>>> and the page is not 4KB but rather large (compound page). Is that the >>>>>>>> problem you have observed? >>>>>>> Drivers are using very large size arguments to dma_alloc_coherent() >>>>>>> for things like RX and TX descriptor rings. >>>>> Large size like larger than 512kB? That would also cause problems >>>>> on baremetal then when swiotlb is activated I believe. >>>> I was looking at network IO performance so the buffers would not >>>> have been that large. I think large in this context is relative to >>>> the 4k page size and the odds of the buffer spanning a page >>>> boundary. For context I saw ~5-10% performance increase with guest >>>> network throughput by avoiding bounce buffers and also saw dom0 tcp >>>> streaming performance go from ~6Gb/s to over 9Gb/s on my test setup >>>> with a 10Gb NIC. >>> OK, but that would not be the dma_alloc_coherent ones then? That sounds >>> more like the generic TCP mechanism allocated 64KB pages instead of 4KB >>> and used those. >>> >>> Did you try looking at this hack that Ian proposed a long time ago >>> to verify that it is said problem? >>> >>> https://lkml.org/lkml/2013/9/4/540 >>> >> Yes I had seen that and intially had the same reaction but the >> change was relatively recent and not relevant. I *think* all the >> coherent allocations are ok since the swiotlb makes them contiguous. >> The problem comes with the use of the streaming api. As one example >> with jumbo frames enabled a driver might use larger rx buffers which >> triggers the problem. >> >> I think the right thing to do is to make the dma streaming api work >> better with larger buffers on dom0. That way it works across all > OK. >> drivers and device types regardless of how they were designed. > Can you point me to an example of the DMA streaming API? > > I am not sure if you mean 'streaming API' as scatter gather operations > using DMA API? > > Is there a particular easy way for me to reproduce this. I have > to say I hadn't enabled Jumbo frame on my box since I am not even > sure if the switch I have can do it. Is there a idiots-punch-list > of how to reproduce this? > > Thanks! By streaming API I'm just referring to drivers that use dma_map_single/dma_unmap_single on every buffer instead of using coherent allocations. 
So not related to sg in my case. If you want an example of this you can look at the bnx2x Broadcom driver. To reproduce this at a minimum you'll need to have: 1) Enough dom0 memory so it overlaps with PCI space and gets remapped by Linux at boot 2) A driver that uses dma_map_single/dma_unmap_single 3) Large enough buffers so that they span page boundaries Things that may help with 3 are enabling jumbos and various offload settings in either guests or dom0. >>>>>>> --msw >>>>>> It's the dma streaming api I've noticed the problem with, so >>>>>> dma_map_single(). Applicable swiotlb code would be >>>>>> xen_swiotlb_map_page() and range_straddles_page_boundary(). So yes >>>>>> for larger buffers it can cause bouncing. ^ permalink raw reply [flat|nested] 55+ messages in thread
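On point 3), the decisive question is whether a streamed buffer that crosses a page boundary is still machine-contiguous underneath. The helper below is a simplified illustration written for this thread, not a copy of the in-tree range_straddles_page_boundary(); pfn_to_mfn() is the real PV lookup, the function name is invented.

#include <linux/mm.h>
#include <linux/types.h>
#include <asm/xen/page.h>	/* pfn_to_mfn() on a PV dom0 */

/* Simplified: a streaming mapping can go straight to the device only if it
 * fits in one page or its PFNs are backed by consecutive MFNs; otherwise
 * xen-swiotlb has to bounce it through its own contiguous memory. */
static bool example_needs_bounce(phys_addr_t paddr, size_t size)
{
	unsigned long first_pfn = paddr >> PAGE_SHIFT;
	unsigned long last_pfn = (paddr + size - 1) >> PAGE_SHIFT;
	unsigned long pfn;

	if (first_pfn == last_pfn)
		return false;			/* fits inside a single 4k page */

	for (pfn = first_pfn; pfn < last_pfn; pfn++)
		if (pfn_to_mfn(pfn) + 1 != pfn_to_mfn(pfn + 1))
			return true;		/* discontiguous MFNs: must bounce */

	return false;
}

With the remapped PCI-hole memory handed out in reverse MFN order, the inner test fails for essentially every multi-page buffer, which is where the bounce-buffer overhead described above comes from.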
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-03-28 22:06 ` Matthew Rushton @ 2014-03-31 14:15 ` Konrad Rzeszutek Wilk 2014-04-01 3:25 ` Matthew Rushton 0 siblings, 1 reply; 55+ messages in thread From: Konrad Rzeszutek Wilk @ 2014-03-31 14:15 UTC (permalink / raw) To: Matthew Rushton Cc: Keir Fraser, Matt Wilson, Matt Wilson, Tim Deegan, Jan Beulich, Andrew Cooper, xen-devel On Fri, Mar 28, 2014 at 03:06:23PM -0700, Matthew Rushton wrote: > On 03/28/14 10:02, Konrad Rzeszutek Wilk wrote: > >On Wed, Mar 26, 2014 at 03:15:42PM -0700, Matthew Rushton wrote: > >>On 03/26/14 10:56, Konrad Rzeszutek Wilk wrote: > >>>On Wed, Mar 26, 2014 at 10:47:44AM -0700, Matthew Rushton wrote: > >>>>On 03/26/14 09:36, Konrad Rzeszutek Wilk wrote: > >>>>>On Wed, Mar 26, 2014 at 08:59:04AM -0700, Matthew Rushton wrote: > >>>>>>On 03/26/14 08:15, Matt Wilson wrote: > >>>>>>>On Wed, Mar 26, 2014 at 11:08:01AM -0400, Konrad Rzeszutek Wilk wrote: > >>>>>>>>Could you elaborate a bit more on the use-case please? > >>>>>>>>My understanding is that most drivers use a scatter gather list - in which > >>>>>>>>case it does not matter if the underlaying MFNs in the PFNs spare are > >>>>>>>>not contingous. > >>>>>>>> > >>>>>>>>But I presume the issue you are hitting is with drivers doing dma_map_page > >>>>>>>>and the page is not 4KB but rather large (compound page). Is that the > >>>>>>>>problem you have observed? > >>>>>>>Drivers are using very large size arguments to dma_alloc_coherent() > >>>>>>>for things like RX and TX descriptor rings. > >>>>>Large size like larger than 512kB? That would also cause problems > >>>>>on baremetal then when swiotlb is activated I believe. > >>>>I was looking at network IO performance so the buffers would not > >>>>have been that large. I think large in this context is relative to > >>>>the 4k page size and the odds of the buffer spanning a page > >>>>boundary. For context I saw ~5-10% performance increase with guest > >>>>network throughput by avoiding bounce buffers and also saw dom0 tcp > >>>>streaming performance go from ~6Gb/s to over 9Gb/s on my test setup > >>>>with a 10Gb NIC. > >>>OK, but that would not be the dma_alloc_coherent ones then? That sounds > >>>more like the generic TCP mechanism allocated 64KB pages instead of 4KB > >>>and used those. > >>> > >>>Did you try looking at this hack that Ian proposed a long time ago > >>>to verify that it is said problem? > >>> > >>>https://lkml.org/lkml/2013/9/4/540 > >>> > >>Yes I had seen that and intially had the same reaction but the > >>change was relatively recent and not relevant. I *think* all the > >>coherent allocations are ok since the swiotlb makes them contiguous. > >>The problem comes with the use of the streaming api. As one example > >>with jumbo frames enabled a driver might use larger rx buffers which > >>triggers the problem. > >> > >>I think the right thing to do is to make the dma streaming api work > >>better with larger buffers on dom0. That way it works across all > >OK. > >>drivers and device types regardless of how they were designed. > >Can you point me to an example of the DMA streaming API? > > > >I am not sure if you mean 'streaming API' as scatter gather operations > >using DMA API? > > > >Is there a particular easy way for me to reproduce this. I have > >to say I hadn't enabled Jumbo frame on my box since I am not even > >sure if the switch I have can do it. Is there a idiots-punch-list > >of how to reproduce this? > > > >Thanks! 
> > By streaming API I'm just referring to drivers that use > dma_map_single/dma_unmap_single on every buffer instead of using > coherent allocations. So not related to sg in my case. If you want > an example of this you can look at the bnx2x Broadcom driver. To > reproduce this at a minimum you'll need to have: > > 1) Enough dom0 memory so it overlaps with PCI space and gets > remapped by Linux at boot Hm? Could you give a bit details? As in is the: [ 0.000000] Allocating PCI resources starting at 7f800000 (gap: 7f800000:7c800000) value? As in that value should be in the PCI space and I am not sure how your dom0 memory overlaps? If you do say dom0_mem=max:3G the kernel will balloon out of the MMIO regions and the gaps (so PCI space) and put that memory past the 4GB. So the MMIO regions end up being MMIO regions. > 2) A driver that uses dma_map_single/dma_unmap_single OK, > 3) Large enough buffers so that they span page boundaries Um, right, so I think the get_order hack that was posted would help in that so you would not span page boundaries? > > Things that may help with 3 are enabling jumbos and various offload > settings in either guests or dom0. If you booted baremetal with 'iommu=soft swiotlb=force' the same problem should show up - at least based on the 2) and 3) issue. Well, except that there are no guests but one should be able to trigger this. What do you use for driving traffic? iperf with certain parameters? Thanks! > > >>>>>>>--msw > >>>>>>It's the dma streaming api I've noticed the problem with, so > >>>>>>dma_map_single(). Applicable swiotlb code would be > >>>>>>xen_swiotlb_map_page() and range_straddles_page_boundary(). So yes > >>>>>>for larger buffers it can cause bouncing. > ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-03-31 14:15 ` Konrad Rzeszutek Wilk @ 2014-04-01 3:25 ` Matthew Rushton 2014-04-01 10:48 ` Konrad Rzeszutek Wilk 0 siblings, 1 reply; 55+ messages in thread From: Matthew Rushton @ 2014-04-01 3:25 UTC (permalink / raw) To: Konrad Rzeszutek Wilk Cc: Keir Fraser, Matt Wilson, Matt Wilson, Tim Deegan, Jan Beulich, Andrew Cooper, xen-devel On 03/31/14 07:15, Konrad Rzeszutek Wilk wrote: > On Fri, Mar 28, 2014 at 03:06:23PM -0700, Matthew Rushton wrote: >> On 03/28/14 10:02, Konrad Rzeszutek Wilk wrote: >>> On Wed, Mar 26, 2014 at 03:15:42PM -0700, Matthew Rushton wrote: >>>> On 03/26/14 10:56, Konrad Rzeszutek Wilk wrote: >>>>> On Wed, Mar 26, 2014 at 10:47:44AM -0700, Matthew Rushton wrote: >>>>>> On 03/26/14 09:36, Konrad Rzeszutek Wilk wrote: >>>>>>> On Wed, Mar 26, 2014 at 08:59:04AM -0700, Matthew Rushton wrote: >>>>>>>> On 03/26/14 08:15, Matt Wilson wrote: >>>>>>>>> On Wed, Mar 26, 2014 at 11:08:01AM -0400, Konrad Rzeszutek Wilk wrote: >>>>>>>>>> Could you elaborate a bit more on the use-case please? >>>>>>>>>> My understanding is that most drivers use a scatter gather list - in which >>>>>>>>>> case it does not matter if the underlaying MFNs in the PFNs spare are >>>>>>>>>> not contingous. >>>>>>>>>> >>>>>>>>>> But I presume the issue you are hitting is with drivers doing dma_map_page >>>>>>>>>> and the page is not 4KB but rather large (compound page). Is that the >>>>>>>>>> problem you have observed? >>>>>>>>> Drivers are using very large size arguments to dma_alloc_coherent() >>>>>>>>> for things like RX and TX descriptor rings. >>>>>>> Large size like larger than 512kB? That would also cause problems >>>>>>> on baremetal then when swiotlb is activated I believe. >>>>>> I was looking at network IO performance so the buffers would not >>>>>> have been that large. I think large in this context is relative to >>>>>> the 4k page size and the odds of the buffer spanning a page >>>>>> boundary. For context I saw ~5-10% performance increase with guest >>>>>> network throughput by avoiding bounce buffers and also saw dom0 tcp >>>>>> streaming performance go from ~6Gb/s to over 9Gb/s on my test setup >>>>>> with a 10Gb NIC. >>>>> OK, but that would not be the dma_alloc_coherent ones then? That sounds >>>>> more like the generic TCP mechanism allocated 64KB pages instead of 4KB >>>>> and used those. >>>>> >>>>> Did you try looking at this hack that Ian proposed a long time ago >>>>> to verify that it is said problem? >>>>> >>>>> https://lkml.org/lkml/2013/9/4/540 >>>>> >>>> Yes I had seen that and intially had the same reaction but the >>>> change was relatively recent and not relevant. I *think* all the >>>> coherent allocations are ok since the swiotlb makes them contiguous. >>>> The problem comes with the use of the streaming api. As one example >>>> with jumbo frames enabled a driver might use larger rx buffers which >>>> triggers the problem. >>>> >>>> I think the right thing to do is to make the dma streaming api work >>>> better with larger buffers on dom0. That way it works across all >>> OK. >>>> drivers and device types regardless of how they were designed. >>> Can you point me to an example of the DMA streaming API? >>> >>> I am not sure if you mean 'streaming API' as scatter gather operations >>> using DMA API? >>> >>> Is there a particular easy way for me to reproduce this. I have >>> to say I hadn't enabled Jumbo frame on my box since I am not even >>> sure if the switch I have can do it. 
Is there a idiots-punch-list >>> of how to reproduce this? >>> >>> Thanks! >> By streaming API I'm just referring to drivers that use >> dma_map_single/dma_unmap_single on every buffer instead of using >> coherent allocations. So not related to sg in my case. If you want >> an example of this you can look at the bnx2x Broadcom driver. To >> reproduce this at a minimum you'll need to have: >> >> 1) Enough dom0 memory so it overlaps with PCI space and gets >> remapped by Linux at boot > Hm? Could you give a bit details? As in is the: > > [ 0.000000] Allocating PCI resources starting at 7f800000 (gap: 7f800000:7c800000) > > value? > > As in that value should be in the PCI space and I am not sure > how your dom0 memory overlaps? If you do say dom0_mem=max:3G > the kernel will balloon out of the MMIO regions and the gaps (so PCI space) > and put that memory past the 4GB. So the MMIO regions end up > being MMIO regions. You should see the message from xen_do_chunk() about adding pages back. Something along the lines of: Populating 380000-401fb6 pfn range: 542250 pages added These pages get added in reverse order (mfns reversed) without my proposed Xen change. >> 2) A driver that uses dma_map_single/dma_unmap_single > OK, >> 3) Large enough buffers so that they span page boundaries > Um, right, so I think the get_order hack that was posted would > help in that so you would not span page boundaries? That patch doesn't apply in my case but in principal you're right, any change that would decrease buffers spanning page boundaries would limit bounce buffer usage. >> Things that may help with 3 are enabling jumbos and various offload >> settings in either guests or dom0. > If you booted baremetal with 'iommu=soft swiotlb=force' the same > problem should show up - at least based on the 2) and 3) issue. > > Well, except that there are no guests but one should be able to trigger > this. If that forces the use of bounce buffers than it would be a similar net result if you wanted to see the performance overhead of doing the copies. > What do you use for driving traffic? iperf with certain parameters? I was using netperf. There weren't any magic params to trigger this. I believe with the default tcp stream test I ran into the issue. > > Thanks! Are there any concerns about the proposed Xen change as a reasonable work around for the current implementation? Thank you! >>>>>>>>> --msw >>>>>>>> It's the dma streaming api I've noticed the problem with, so >>>>>>>> dma_map_single(). Applicable swiotlb code would be >>>>>>>> xen_swiotlb_map_page() and range_straddles_page_boundary(). So yes >>>>>>>> for larger buffers it can cause bouncing. ^ permalink raw reply [flat|nested] 55+ messages in thread
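Pulling the reproduction hints from this exchange together, a rough recipe looks like the following; the dom0 memory size, interface name, and peer host are placeholders to adapt to the test machine.

  # Baremetal approximation of the bounce-buffer path:
  #   boot Linux with:   iommu=soft swiotlb=force
  #
  # Xen/dom0 case:
  #   Xen command line:  dom0_mem=16384M    (enough that dom0's PFN range
  #                      overlaps the PCI hole and gets remapped at boot)
  #   ip link set dev eth0 mtu 9000         # jumbo frames -> rx buffers span pages
  #   netperf -H <peer> -t TCP_STREAM       # default tcp stream test is enough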
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-04-01 3:25 ` Matthew Rushton @ 2014-04-01 10:48 ` Konrad Rzeszutek Wilk 2014-04-01 12:22 ` Tim Deegan 0 siblings, 1 reply; 55+ messages in thread From: Konrad Rzeszutek Wilk @ 2014-04-01 10:48 UTC (permalink / raw) To: Matthew Rushton Cc: Keir Fraser, Matt Wilson, Matt Wilson, Tim Deegan, Jan Beulich, Andrew Cooper, xen-devel On Mon, Mar 31, 2014 at 08:25:43PM -0700, Matthew Rushton wrote: > On 03/31/14 07:15, Konrad Rzeszutek Wilk wrote: > >On Fri, Mar 28, 2014 at 03:06:23PM -0700, Matthew Rushton wrote: > >>On 03/28/14 10:02, Konrad Rzeszutek Wilk wrote: > >>>On Wed, Mar 26, 2014 at 03:15:42PM -0700, Matthew Rushton wrote: > >>>>On 03/26/14 10:56, Konrad Rzeszutek Wilk wrote: > >>>>>On Wed, Mar 26, 2014 at 10:47:44AM -0700, Matthew Rushton wrote: > >>>>>>On 03/26/14 09:36, Konrad Rzeszutek Wilk wrote: > >>>>>>>On Wed, Mar 26, 2014 at 08:59:04AM -0700, Matthew Rushton wrote: > >>>>>>>>On 03/26/14 08:15, Matt Wilson wrote: > >>>>>>>>>On Wed, Mar 26, 2014 at 11:08:01AM -0400, Konrad Rzeszutek Wilk wrote: > >>>>>>>>>>Could you elaborate a bit more on the use-case please? > >>>>>>>>>>My understanding is that most drivers use a scatter gather list - in which > >>>>>>>>>>case it does not matter if the underlaying MFNs in the PFNs spare are > >>>>>>>>>>not contingous. > >>>>>>>>>> > >>>>>>>>>>But I presume the issue you are hitting is with drivers doing dma_map_page > >>>>>>>>>>and the page is not 4KB but rather large (compound page). Is that the > >>>>>>>>>>problem you have observed? > >>>>>>>>>Drivers are using very large size arguments to dma_alloc_coherent() > >>>>>>>>>for things like RX and TX descriptor rings. > >>>>>>>Large size like larger than 512kB? That would also cause problems > >>>>>>>on baremetal then when swiotlb is activated I believe. > >>>>>>I was looking at network IO performance so the buffers would not > >>>>>>have been that large. I think large in this context is relative to > >>>>>>the 4k page size and the odds of the buffer spanning a page > >>>>>>boundary. For context I saw ~5-10% performance increase with guest > >>>>>>network throughput by avoiding bounce buffers and also saw dom0 tcp > >>>>>>streaming performance go from ~6Gb/s to over 9Gb/s on my test setup > >>>>>>with a 10Gb NIC. > >>>>>OK, but that would not be the dma_alloc_coherent ones then? That sounds > >>>>>more like the generic TCP mechanism allocated 64KB pages instead of 4KB > >>>>>and used those. > >>>>> > >>>>>Did you try looking at this hack that Ian proposed a long time ago > >>>>>to verify that it is said problem? > >>>>> > >>>>>https://lkml.org/lkml/2013/9/4/540 > >>>>> > >>>>Yes I had seen that and intially had the same reaction but the > >>>>change was relatively recent and not relevant. I *think* all the > >>>>coherent allocations are ok since the swiotlb makes them contiguous. > >>>>The problem comes with the use of the streaming api. As one example > >>>>with jumbo frames enabled a driver might use larger rx buffers which > >>>>triggers the problem. > >>>> > >>>>I think the right thing to do is to make the dma streaming api work > >>>>better with larger buffers on dom0. That way it works across all > >>>OK. > >>>>drivers and device types regardless of how they were designed. > >>>Can you point me to an example of the DMA streaming API? > >>> > >>>I am not sure if you mean 'streaming API' as scatter gather operations > >>>using DMA API? > >>> > >>>Is there a particular easy way for me to reproduce this. 
I have > >>>to say I hadn't enabled Jumbo frame on my box since I am not even > >>>sure if the switch I have can do it. Is there a idiots-punch-list > >>>of how to reproduce this? > >>> > >>>Thanks! > >>By streaming API I'm just referring to drivers that use > >>dma_map_single/dma_unmap_single on every buffer instead of using > >>coherent allocations. So not related to sg in my case. If you want > >>an example of this you can look at the bnx2x Broadcom driver. To > >>reproduce this at a minimum you'll need to have: > >> > >>1) Enough dom0 memory so it overlaps with PCI space and gets > >>remapped by Linux at boot > >Hm? Could you give a bit details? As in is the: > > > >[ 0.000000] Allocating PCI resources starting at 7f800000 (gap: 7f800000:7c800000) > > > >value? > > > >As in that value should be in the PCI space and I am not sure > >how your dom0 memory overlaps? If you do say dom0_mem=max:3G > >the kernel will balloon out of the MMIO regions and the gaps (so PCI space) > >and put that memory past the 4GB. So the MMIO regions end up > >being MMIO regions. > > You should see the message from xen_do_chunk() about adding pages > back. Something along the lines of: > > Populating 380000-401fb6 pfn range: 542250 pages added > > These pages get added in reverse order (mfns reversed) without my > proposed Xen change. > > >>2) A driver that uses dma_map_single/dma_unmap_single > >OK, > >>3) Large enough buffers so that they span page boundaries > >Um, right, so I think the get_order hack that was posted would > >help in that so you would not span page boundaries? > > That patch doesn't apply in my case but in principal you're right, > any change that would decrease buffers spanning page boundaries > would limit bounce buffer usage. > > >>Things that may help with 3 are enabling jumbos and various offload > >>settings in either guests or dom0. > >If you booted baremetal with 'iommu=soft swiotlb=force' the same > >problem should show up - at least based on the 2) and 3) issue. > > > >Well, except that there are no guests but one should be able to trigger > >this. > > If that forces the use of bounce buffers than it would be a similar > net result if you wanted to see the performance overhead of doing > the copies. > > >What do you use for driving traffic? iperf with certain parameters? > > I was using netperf. There weren't any magic params to trigger this. > I believe with the default tcp stream test I ran into the issue. > > > > > >Thanks! > > Are there any concerns about the proposed Xen change as a reasonable > work around for the current implementation? Thank you! So I finally understood what the concern was about it - the balloon mechanics get the pages in worst possible order. I am wondeirng if there is something on the Linux side we can do to tell Xen to give them to use in the proper order? Could we swap the order of xen_do_chunk so it starts from the end and goes to start? Would that help? Or maybe do an array of 512 chunks (I had an prototype patch like that floating around to speed this up)? > > >>>>>>>>>--msw > >>>>>>>>It's the dma streaming api I've noticed the problem with, so > >>>>>>>>dma_map_single(). Applicable swiotlb code would be > >>>>>>>>xen_swiotlb_map_page() and range_straddles_page_boundary(). So yes > >>>>>>>>for larger buffers it can cause bouncing. > ^ permalink raw reply [flat|nested] 55+ messages in thread
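For reference, the "array of 512 chunks" idea amounts to passing XENMEM_populate_physmap a batch of frames per hypercall rather than one page at a time. A rough sketch under that assumption is below; the batch size and function name are arbitrary, error handling is trimmed, and batching by itself only cuts the hypercall count -- the MFN ordering is still whatever the hypervisor allocator hands back.

#include <linux/kernel.h>
#include <xen/interface/memory.h>
#include <asm/xen/hypercall.h>
#include <asm/xen/page.h>

#define EXAMPLE_BATCH 512

/* Populate [start_pfn, start_pfn + count) with order-0 pages, EXAMPLE_BATCH
 * extents per XENMEM_populate_physmap call.  On return the frame list holds
 * the MFNs Xen chose, which must be written back into the P2M. */
static unsigned long example_populate_batched(unsigned long start_pfn, unsigned long count)
{
	static xen_pfn_t frames[EXAMPLE_BATCH];
	struct xen_memory_reservation res = {
		.extent_order = 0,
		.domid = DOMID_SELF,
	};
	unsigned long done = 0;

	while (done < count) {
		unsigned long i, n = min(count - done, (unsigned long)EXAMPLE_BATCH);
		int rc;

		for (i = 0; i < n; i++)
			frames[i] = start_pfn + done + i;	/* PFNs to back */

		set_xen_guest_handle(res.extent_start, frames);
		res.nr_extents = n;

		rc = HYPERVISOR_memory_op(XENMEM_populate_physmap, &res);
		if (rc <= 0)
			break;

		for (i = 0; i < (unsigned long)rc; i++)		/* frames[] now holds MFNs */
			set_phys_to_machine(start_pfn + done + i, frames[i]);
		done += rc;
	}
	return done;
}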
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-04-01 10:48 ` Konrad Rzeszutek Wilk @ 2014-04-01 12:22 ` Tim Deegan 2014-04-02 0:17 ` Matthew Rushton 0 siblings, 1 reply; 55+ messages in thread From: Tim Deegan @ 2014-04-01 12:22 UTC (permalink / raw) To: Konrad Rzeszutek Wilk Cc: Keir Fraser, Matt Wilson, Matthew Rushton, Matt Wilson, Jan Beulich, Andrew Cooper, xen-devel At 06:48 -0400 on 01 Apr (1396331306), Konrad Rzeszutek Wilk wrote: > On Mon, Mar 31, 2014 at 08:25:43PM -0700, Matthew Rushton wrote: > > Are there any concerns about the proposed Xen change as a reasonable > > work around for the current implementation? Thank you! > > So I finally understood what the concern was about it - the balloon > mechanics get the pages in worst possible order. I am wondeirng if there > is something on the Linux side we can do to tell Xen to give them to use > in the proper order? The best way, at least from Xen's point of view, is to explicitly allocate contiguous pages in the cases where it'll make a difference AIUI linux already does this for some classes of dma-able memory. > Could we swap the order of xen_do_chunk so it starts from the end and > goes to start? Would that help? As long as we don't also change the default allocation order in Xen. :) In general, linux shouldn't rely on the order that Xen allocates memory, as that might change later. If the current API can't do what's needed, maybe we can add another allocator hypercall or flag? Cheers, Tim. ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-04-01 12:22 ` Tim Deegan @ 2014-04-02 0:17 ` Matthew Rushton 2014-04-02 7:52 ` Jan Beulich 0 siblings, 1 reply; 55+ messages in thread From: Matthew Rushton @ 2014-04-02 0:17 UTC (permalink / raw) To: Tim Deegan, Konrad Rzeszutek Wilk Cc: Keir Fraser, Jan Beulich, Matt Wilson, Matt Wilson, Andrew Cooper, xen-devel On 04/01/14 05:22, Tim Deegan wrote: > At 06:48 -0400 on 01 Apr (1396331306), Konrad Rzeszutek Wilk wrote: >> On Mon, Mar 31, 2014 at 08:25:43PM -0700, Matthew Rushton wrote: >>> Are there any concerns about the proposed Xen change as a reasonable >>> work around for the current implementation? Thank you! >> So I finally understood what the concern was about it - the balloon >> mechanics get the pages in worst possible order. I am wondeirng if there >> is something on the Linux side we can do to tell Xen to give them to use >> in the proper order? > The best way, at least from Xen's point of view, is to explicitly > allocate contiguous pages in the cases where it'll make a difference > AIUI linux already does this for some classes of dma-able memory. I'm in agreement that if any change is made to Linux it should be to make as large as possible allocations and back off accordingly. I suppose another approach could be to add a boot option to not reallocate at all. >> Could we swap the order of xen_do_chunk so it starts from the end and >> goes to start? Would that help? > As long as we don't also change the default allocation order in > Xen. :) In general, linux shouldn't rely on the order that Xen > allocates memory, as that might change later. If the current API > can't do what's needed, maybe we can add another allocator > hypercall or flag? Agree on not relying on the order in the long run. A new hypercall or flag seems like overkill right now. The question for me comes down to my proposed change which is more simple and solves the short term problem or investing time in reworking the Linux code to make large allocations. > > Cheers, > > Tim. ^ permalink raw reply [flat|nested] 55+ messages in thread
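A sketch of that "as large as possible, back off accordingly" shape, under the assumption that the remap path asks for 2MB (order-9) extents first and drops to smaller orders when Xen cannot satisfy them. The names and starting order are illustrative, and the per-page P2M updates for each successful extent are elided (they would mirror the order-0 case).

#include <xen/interface/memory.h>
#include <asm/xen/hypercall.h>

static void example_populate_with_backoff(unsigned long start_pfn, unsigned long count)
{
	unsigned long pfn = start_pfn, end = start_pfn + count;
	unsigned int order = 9;				/* try 2MB extents first */

	while (pfn < end) {
		xen_pfn_t frame = pfn;
		struct xen_memory_reservation res = {
			.nr_extents = 1,
			.domid = DOMID_SELF,
		};
		int rc;

		/* never request past the end of the range being populated */
		while (order && pfn + (1UL << order) > end)
			order--;
		res.extent_order = order;

		set_xen_guest_handle(res.extent_start, &frame);
		rc = HYPERVISOR_memory_op(XENMEM_populate_physmap, &res);
		if (rc == 1)
			pfn += 1UL << order;		/* got one machine-contiguous extent */
		else if (order > 0)
			order--;			/* back off and retry this pfn smaller */
		else
			break;				/* even a single page failed */
	}
}

Each successful non-zero-order extent gives PFNs backed by consecutive MFNs, which is exactly what keeps dma_map_single() off the bounce path; the trade-off raised later in the thread is that large contiguous requests can also exhaust the memory Xen keeps for DMA-style allocations.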
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-04-02 0:17 ` Matthew Rushton @ 2014-04-02 7:52 ` Jan Beulich 2014-04-02 10:06 ` Ian Campbell 0 siblings, 1 reply; 55+ messages in thread From: Jan Beulich @ 2014-04-02 7:52 UTC (permalink / raw) To: Matthew Rushton Cc: Keir Fraser, Andrew Cooper, Tim Deegan, Matt Wilson, Matt Wilson, xen-devel >>> On 02.04.14 at 02:17, <mvrushton@gmail.com> wrote: > On 04/01/14 05:22, Tim Deegan wrote: >> As long as we don't also change the default allocation order in >> Xen. :) In general, linux shouldn't rely on the order that Xen >> allocates memory, as that might change later. If the current API >> can't do what's needed, maybe we can add another allocator >> hypercall or flag? > > Agree on not relying on the order in the long run. A new hypercall or > flag seems like overkill right now. The question for me comes down to my > proposed change which is more simple and solves the short term problem > or investing time in reworking the Linux code to make large allocations. I think it has become pretty clear by now that we'd rather not alter the hypervisor allocator for a purpose like this. Jan ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-04-02 7:52 ` Jan Beulich @ 2014-04-02 10:06 ` Ian Campbell 2014-04-02 10:15 ` Jan Beulich 0 siblings, 1 reply; 55+ messages in thread From: Ian Campbell @ 2014-04-02 10:06 UTC (permalink / raw) To: Jan Beulich Cc: Keir Fraser, Matthew Rushton, Andrew Cooper, Tim Deegan, Matt Wilson, Matt Wilson, xen-devel On Wed, 2014-04-02 at 08:52 +0100, Jan Beulich wrote: > >>> On 02.04.14 at 02:17, <mvrushton@gmail.com> wrote: > > On 04/01/14 05:22, Tim Deegan wrote: > >> As long as we don't also change the default allocation order in > >> Xen. :) In general, linux shouldn't rely on the order that Xen > >> allocates memory, as that might change later. If the current API > >> can't do what's needed, maybe we can add another allocator > >> hypercall or flag? > > > > Agree on not relying on the order in the long run. A new hypercall or > > flag seems like overkill right now. The question for me comes down to my > > proposed change which is more simple and solves the short term problem > > or investing time in reworking the Linux code to make large allocations. > > I think it has become pretty clear by now that we'd rather not alter > the hypervisor allocator for a purpose like this. Does it even actually solve the problem? It seems like it is just deferring it until sufficient fragmentation has occurred in the system. All its really done is make the eventual issue much harder to debug. Ian. ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-04-02 10:06 ` Ian Campbell @ 2014-04-02 10:15 ` Jan Beulich 2014-04-02 10:20 ` Ian Campbell 0 siblings, 1 reply; 55+ messages in thread From: Jan Beulich @ 2014-04-02 10:15 UTC (permalink / raw) To: Ian Campbell Cc: Keir Fraser, Matthew Rushton, AndrewCooper, Tim Deegan, Matt Wilson, Matt Wilson, xen-devel >>> On 02.04.14 at 12:06, <Ian.Campbell@citrix.com> wrote: > On Wed, 2014-04-02 at 08:52 +0100, Jan Beulich wrote: >> >>> On 02.04.14 at 02:17, <mvrushton@gmail.com> wrote: >> > On 04/01/14 05:22, Tim Deegan wrote: >> >> As long as we don't also change the default allocation order in >> >> Xen. :) In general, linux shouldn't rely on the order that Xen >> >> allocates memory, as that might change later. If the current API >> >> can't do what's needed, maybe we can add another allocator >> >> hypercall or flag? >> > >> > Agree on not relying on the order in the long run. A new hypercall or >> > flag seems like overkill right now. The question for me comes down to my >> > proposed change which is more simple and solves the short term problem >> > or investing time in reworking the Linux code to make large allocations. >> >> I think it has become pretty clear by now that we'd rather not alter >> the hypervisor allocator for a purpose like this. > > Does it even actually solve the problem? It seems like it is just > deferring it until sufficient fragmentation has occurred in the system. > All its really done is make the eventual issue much harder to debug. Wasn't this largely for Dom0 (in which case fragmentation shouldn't matter yet)? Jan ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-04-02 10:15 ` Jan Beulich @ 2014-04-02 10:20 ` Ian Campbell 2014-04-09 22:21 ` Matthew Rushton 0 siblings, 1 reply; 55+ messages in thread From: Ian Campbell @ 2014-04-02 10:20 UTC (permalink / raw) To: Jan Beulich Cc: Keir Fraser, Matthew Rushton, AndrewCooper, Tim Deegan, Matt Wilson, Matt Wilson, xen-devel On Wed, 2014-04-02 at 11:15 +0100, Jan Beulich wrote: > >>> On 02.04.14 at 12:06, <Ian.Campbell@citrix.com> wrote: > > On Wed, 2014-04-02 at 08:52 +0100, Jan Beulich wrote: > >> >>> On 02.04.14 at 02:17, <mvrushton@gmail.com> wrote: > >> > On 04/01/14 05:22, Tim Deegan wrote: > >> >> As long as we don't also change the default allocation order in > >> >> Xen. :) In general, linux shouldn't rely on the order that Xen > >> >> allocates memory, as that might change later. If the current API > >> >> can't do what's needed, maybe we can add another allocator > >> >> hypercall or flag? > >> > > >> > Agree on not relying on the order in the long run. A new hypercall or > >> > flag seems like overkill right now. The question for me comes down to my > >> > proposed change which is more simple and solves the short term problem > >> > or investing time in reworking the Linux code to make large allocations. > >> > >> I think it has become pretty clear by now that we'd rather not alter > >> the hypervisor allocator for a purpose like this. > > > > Does it even actually solve the problem? It seems like it is just > > deferring it until sufficient fragmentation has occurred in the system. > > All its really done is make the eventual issue much harder to debug. > > Wasn't this largely for Dom0 (in which case fragmentation shouldn't > matter yet)? Dom0 ballooning breaks any assumptions you might make about relying on early allocations. Ian. ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-04-02 10:20 ` Ian Campbell @ 2014-04-09 22:21 ` Matthew Rushton 2014-04-10 6:14 ` Jan Beulich 2014-04-11 17:05 ` Konrad Rzeszutek Wilk 0 siblings, 2 replies; 55+ messages in thread From: Matthew Rushton @ 2014-04-09 22:21 UTC (permalink / raw) To: Ian Campbell, Jan Beulich Cc: Keir Fraser, AndrewCooper, Tim Deegan, Matt Wilson, Matt Wilson, xen-devel On 04/02/14 03:20, Ian Campbell wrote: > On Wed, 2014-04-02 at 11:15 +0100, Jan Beulich wrote: >>>>> On 02.04.14 at 12:06, <Ian.Campbell@citrix.com> wrote: >>> On Wed, 2014-04-02 at 08:52 +0100, Jan Beulich wrote: >>>>>>> On 02.04.14 at 02:17, <mvrushton@gmail.com> wrote: >>>>> On 04/01/14 05:22, Tim Deegan wrote: >>>>>> As long as we don't also change the default allocation order in >>>>>> Xen. :) In general, linux shouldn't rely on the order that Xen >>>>>> allocates memory, as that might change later. If the current API >>>>>> can't do what's needed, maybe we can add another allocator >>>>>> hypercall or flag? >>>>> Agree on not relying on the order in the long run. A new hypercall or >>>>> flag seems like overkill right now. The question for me comes down to my >>>>> proposed change which is more simple and solves the short term problem >>>>> or investing time in reworking the Linux code to make large allocations. >>>> I think it has become pretty clear by now that we'd rather not alter >>>> the hypervisor allocator for a purpose like this. OK understood see below. >>> Does it even actually solve the problem? It seems like it is just >>> deferring it until sufficient fragmentation has occurred in the system. >>> All its really done is make the eventual issue much harder to debug. >> Wasn't this largely for Dom0 (in which case fragmentation shouldn't >> matter yet)? > Dom0 ballooning breaks any assumptions you might make about relying on > early allocations. I think you're missing the point. I'm not arguing that this change is a general purpose solution to guarantee that dom0 is contiguous. Fragmentation can exist even if dom0 asks for larger allocations like it should (which the balloon driver does I believe). What the change does do is solve a real problem in the current Linux PCI remapping implementation which happens during dom0 intialization. If the allocation strategy is arbitrary why not make the proposed hypervisor change to make existing Linux implementations behave better and in addition fix the problem in Linux so moving forward things are safe? > Ian. > ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-04-09 22:21 ` Matthew Rushton @ 2014-04-10 6:14 ` Jan Beulich 2014-04-11 20:20 ` Matthew Rushton 2014-04-11 17:05 ` Konrad Rzeszutek Wilk 1 sibling, 1 reply; 55+ messages in thread From: Jan Beulich @ 2014-04-10 6:14 UTC (permalink / raw) To: Matthew Rushton Cc: Keir Fraser, Ian Campbell, AndrewCooper, Tim Deegan, Matt Wilson, Matt Wilson, xen-devel >>> On 10.04.14 at 00:21, <mvrushton@gmail.com> wrote: > On 04/02/14 03:20, Ian Campbell wrote: >> Dom0 ballooning breaks any assumptions you might make about relying on >> early allocations. > > I think you're missing the point. I'm not arguing that this change is a > general purpose solution to guarantee that dom0 is contiguous. > Fragmentation can exist even if dom0 asks for larger allocations like it > should (which the balloon driver does I believe). What the change does > do is solve a real problem in the current Linux PCI remapping > implementation which happens during dom0 intialization. If the > allocation strategy is arbitrary why not make the proposed hypervisor > change to make existing Linux implementations behave better and in > addition fix the problem in Linux so moving forward things are safe? Apart from all other arguments speaking against this, did you consider that altering the hypervisor behavior may adversely affect some other Dom0-capable OS? Problems in Linux should, as said before, get fixed in Linux. If older versions are affected, stable backports should subsequently be requested/done. Jan ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-04-10 6:14 ` Jan Beulich @ 2014-04-11 20:20 ` Matthew Rushton 0 siblings, 0 replies; 55+ messages in thread From: Matthew Rushton @ 2014-04-11 20:20 UTC (permalink / raw) To: Jan Beulich Cc: Keir Fraser, Ian Campbell, AndrewCooper, Tim Deegan, Matt Wilson, Matt Wilson, xen-devel On 04/09/14 23:14, Jan Beulich wrote: >>>> On 10.04.14 at 00:21, <mvrushton@gmail.com> wrote: >> On 04/02/14 03:20, Ian Campbell wrote: >>> Dom0 ballooning breaks any assumptions you might make about relying on >>> early allocations. >> I think you're missing the point. I'm not arguing that this change is a >> general purpose solution to guarantee that dom0 is contiguous. >> Fragmentation can exist even if dom0 asks for larger allocations like it >> should (which the balloon driver does I believe). What the change does >> do is solve a real problem in the current Linux PCI remapping >> implementation which happens during dom0 intialization. If the >> allocation strategy is arbitrary why not make the proposed hypervisor >> change to make existing Linux implementations behave better and in >> addition fix the problem in Linux so moving forward things are safe? > Apart from all other arguments speaking against this, did you > consider that altering the hypervisor behavior may adversely > affect some other Dom0-capable OS? Sure I've been considering the more intuitive dom0 implementation of allocating memory low to high and looking at things pragmatically. > > Problems in Linux should, as said before, get fixed in Linux. If > older versions are affected, stable backports should subsequently > be requested/done. > > Jan > ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-04-09 22:21 ` Matthew Rushton 2014-04-10 6:14 ` Jan Beulich @ 2014-04-11 17:05 ` Konrad Rzeszutek Wilk 2014-04-11 20:28 ` Matthew Rushton 2014-04-13 21:32 ` Tim Deegan 1 sibling, 2 replies; 55+ messages in thread From: Konrad Rzeszutek Wilk @ 2014-04-11 17:05 UTC (permalink / raw) To: Matthew Rushton Cc: Keir Fraser, Ian Campbell, AndrewCooper, Tim Deegan, Jan Beulich, Matt Wilson, Matt Wilson, xen-devel On Wed, Apr 09, 2014 at 03:21:38PM -0700, Matthew Rushton wrote: > On 04/02/14 03:20, Ian Campbell wrote: > >On Wed, 2014-04-02 at 11:15 +0100, Jan Beulich wrote: > >>>>>On 02.04.14 at 12:06, <Ian.Campbell@citrix.com> wrote: > >>>On Wed, 2014-04-02 at 08:52 +0100, Jan Beulich wrote: > >>>>>>>On 02.04.14 at 02:17, <mvrushton@gmail.com> wrote: > >>>>>On 04/01/14 05:22, Tim Deegan wrote: > >>>>>>As long as we don't also change the default allocation order in > >>>>>>Xen. :) In general, linux shouldn't rely on the order that Xen > >>>>>>allocates memory, as that might change later. If the current API > >>>>>>can't do what's needed, maybe we can add another allocator > >>>>>>hypercall or flag? > >>>>>Agree on not relying on the order in the long run. A new hypercall or > >>>>>flag seems like overkill right now. The question for me comes down to my > >>>>>proposed change which is more simple and solves the short term problem > >>>>>or investing time in reworking the Linux code to make large allocations. > >>>>I think it has become pretty clear by now that we'd rather not alter > >>>>the hypervisor allocator for a purpose like this. > > OK understood see below. > > >>>Does it even actually solve the problem? It seems like it is just > >>>deferring it until sufficient fragmentation has occurred in the system. > >>>All its really done is make the eventual issue much harder to debug. > >>Wasn't this largely for Dom0 (in which case fragmentation shouldn't > >>matter yet)? > >Dom0 ballooning breaks any assumptions you might make about relying on > >early allocations. > > I think you're missing the point. I'm not arguing that this change > is a general purpose solution to guarantee that dom0 is contiguous. > Fragmentation can exist even if dom0 asks for larger allocations > like it should (which the balloon driver does I believe). What the > change does do is solve a real problem in the current Linux PCI > remapping implementation which happens during dom0 intialization. If > the allocation strategy is arbitrary why not make the proposed > hypervisor change to make existing Linux implementations behave > better and in addition fix the problem in Linux so moving forward > things are safe? I think Tim was OK with that - as long as it was based on a flag - meaning when we do the increase_reservation call we use an extra flag to ask for contingous PFNs. > > >Ian. > > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-04-11 17:05 ` Konrad Rzeszutek Wilk @ 2014-04-11 20:28 ` Matthew Rushton 2014-04-12 1:34 ` Konrad Rzeszutek Wilk 2014-04-13 21:32 ` Tim Deegan 1 sibling, 1 reply; 55+ messages in thread From: Matthew Rushton @ 2014-04-11 20:28 UTC (permalink / raw) To: Konrad Rzeszutek Wilk Cc: Keir Fraser, Ian Campbell, AndrewCooper, Tim Deegan, Jan Beulich, Matt Wilson, Matt Wilson, xen-devel On 04/11/14 10:05, Konrad Rzeszutek Wilk wrote: > On Wed, Apr 09, 2014 at 03:21:38PM -0700, Matthew Rushton wrote: >> On 04/02/14 03:20, Ian Campbell wrote: >>> On Wed, 2014-04-02 at 11:15 +0100, Jan Beulich wrote: >>>>>>> On 02.04.14 at 12:06, <Ian.Campbell@citrix.com> wrote: >>>>> On Wed, 2014-04-02 at 08:52 +0100, Jan Beulich wrote: >>>>>>>>> On 02.04.14 at 02:17, <mvrushton@gmail.com> wrote: >>>>>>> On 04/01/14 05:22, Tim Deegan wrote: >>>>>>>> As long as we don't also change the default allocation order in >>>>>>>> Xen. :) In general, linux shouldn't rely on the order that Xen >>>>>>>> allocates memory, as that might change later. If the current API >>>>>>>> can't do what's needed, maybe we can add another allocator >>>>>>>> hypercall or flag? >>>>>>> Agree on not relying on the order in the long run. A new hypercall or >>>>>>> flag seems like overkill right now. The question for me comes down to my >>>>>>> proposed change which is more simple and solves the short term problem >>>>>>> or investing time in reworking the Linux code to make large allocations. >>>>>> I think it has become pretty clear by now that we'd rather not alter >>>>>> the hypervisor allocator for a purpose like this. >> OK understood see below. >> >>>>> Does it even actually solve the problem? It seems like it is just >>>>> deferring it until sufficient fragmentation has occurred in the system. >>>>> All its really done is make the eventual issue much harder to debug. >>>> Wasn't this largely for Dom0 (in which case fragmentation shouldn't >>>> matter yet)? >>> Dom0 ballooning breaks any assumptions you might make about relying on >>> early allocations. >> I think you're missing the point. I'm not arguing that this change >> is a general purpose solution to guarantee that dom0 is contiguous. >> Fragmentation can exist even if dom0 asks for larger allocations >> like it should (which the balloon driver does I believe). What the >> change does do is solve a real problem in the current Linux PCI >> remapping implementation which happens during dom0 intialization. If >> the allocation strategy is arbitrary why not make the proposed >> hypervisor change to make existing Linux implementations behave >> better and in addition fix the problem in Linux so moving forward >> things are safe? > I think Tim was OK with that - as long as it was based on a flag - meaning > when we do the increase_reservation call we use an extra flag > to ask for contingous PFNs. OK the extra flag feels a little dirty to me but it would solve the problem. What are your thoughts on changing Linux to make higher order allocations or more minimally adding a boot parameter to not remap the memory at all for those that care about performance? I know the Linux code is already fairly complex and your preference was not to make it worse. >>> Ian. >>> >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xen.org >> http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 55+ messages in thread
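For concreteness only: no such flag exists in the Xen ABI, but the "extra flag" being discussed would presumably look something like a new mem_flags bit next to the existing XENMEMF_* definitions, which the dom0 populate/increase-reservation path could set when it wants ascending MFNs. Everything below, including the flag name and bit position, is invented for illustration.

#include <xen/interface/memory.h>
#include <asm/xen/hypercall.h>

/* Hypothetical flag -- not part of the real ABI. */
#define XENMEMF_contig_hint	(1U << 18)

static int example_populate_one(unsigned long gpfn)
{
	xen_pfn_t frame = gpfn;
	struct xen_memory_reservation res = {
		.nr_extents = 1,
		.extent_order = 0,
		.mem_flags = XENMEMF_contig_hint,	/* hypothetical: ask for ascending MFNs */
		.domid = DOMID_SELF,
	};

	set_xen_guest_handle(res.extent_start, &frame);
	return HYPERVISOR_memory_op(XENMEM_populate_physmap, &res);
}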
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-04-11 20:28 ` Matthew Rushton @ 2014-04-12 1:34 ` Konrad Rzeszutek Wilk 0 siblings, 0 replies; 55+ messages in thread From: Konrad Rzeszutek Wilk @ 2014-04-12 1:34 UTC (permalink / raw) To: Matthew Rushton Cc: Keir Fraser, Ian Campbell, AndrewCooper, Tim Deegan, Jan Beulich, Matt Wilson, Matt Wilson, xen-devel On Fri, Apr 11, 2014 at 01:28:45PM -0700, Matthew Rushton wrote: > On 04/11/14 10:05, Konrad Rzeszutek Wilk wrote: > >On Wed, Apr 09, 2014 at 03:21:38PM -0700, Matthew Rushton wrote: > >>On 04/02/14 03:20, Ian Campbell wrote: > >>>On Wed, 2014-04-02 at 11:15 +0100, Jan Beulich wrote: > >>>>>>>On 02.04.14 at 12:06, <Ian.Campbell@citrix.com> wrote: > >>>>>On Wed, 2014-04-02 at 08:52 +0100, Jan Beulich wrote: > >>>>>>>>>On 02.04.14 at 02:17, <mvrushton@gmail.com> wrote: > >>>>>>>On 04/01/14 05:22, Tim Deegan wrote: > >>>>>>>>As long as we don't also change the default allocation order in > >>>>>>>>Xen. :) In general, linux shouldn't rely on the order that Xen > >>>>>>>>allocates memory, as that might change later. If the current API > >>>>>>>>can't do what's needed, maybe we can add another allocator > >>>>>>>>hypercall or flag? > >>>>>>>Agree on not relying on the order in the long run. A new hypercall or > >>>>>>>flag seems like overkill right now. The question for me comes down to my > >>>>>>>proposed change which is more simple and solves the short term problem > >>>>>>>or investing time in reworking the Linux code to make large allocations. > >>>>>>I think it has become pretty clear by now that we'd rather not alter > >>>>>>the hypervisor allocator for a purpose like this. > >>OK understood see below. > >> > >>>>>Does it even actually solve the problem? It seems like it is just > >>>>>deferring it until sufficient fragmentation has occurred in the system. > >>>>>All its really done is make the eventual issue much harder to debug. > >>>>Wasn't this largely for Dom0 (in which case fragmentation shouldn't > >>>>matter yet)? > >>>Dom0 ballooning breaks any assumptions you might make about relying on > >>>early allocations. > >>I think you're missing the point. I'm not arguing that this change > >>is a general purpose solution to guarantee that dom0 is contiguous. > >>Fragmentation can exist even if dom0 asks for larger allocations > >>like it should (which the balloon driver does I believe). What the > >>change does do is solve a real problem in the current Linux PCI > >>remapping implementation which happens during dom0 intialization. If > >>the allocation strategy is arbitrary why not make the proposed > >>hypervisor change to make existing Linux implementations behave > >>better and in addition fix the problem in Linux so moving forward > >>things are safe? > >I think Tim was OK with that - as long as it was based on a flag - meaning > >when we do the increase_reservation call we use an extra flag > >to ask for contingous PFNs. > > OK the extra flag feels a little dirty to me but it would solve the > problem. What are your thoughts on changing Linux to make higher > order allocations or more minimally adding a boot parameter to not > remap the memory at all for those that care about performance? I Oh, so just leave it ballooned down? I presume you can get the same exact behavior if you have your dom0_mem=max:X value tweaked just right? And your E820 does not look like swiss cheese. > know the Linux code is already fairly complex and your preference > was not to make it worse. > > >>>Ian. 
> >>> > >> > >>_______________________________________________ > >>Xen-devel mailing list > >>Xen-devel@lists.xen.org > >>http://lists.xen.org/xen-devel > ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-04-11 17:05 ` Konrad Rzeszutek Wilk 2014-04-11 20:28 ` Matthew Rushton @ 2014-04-13 21:32 ` Tim Deegan 2014-04-14 8:51 ` Jan Beulich 1 sibling, 1 reply; 55+ messages in thread From: Tim Deegan @ 2014-04-13 21:32 UTC (permalink / raw) To: Konrad Rzeszutek Wilk Cc: Keir Fraser, Matt Wilson, Matthew Rushton, AndrewCooper, Jan Beulich, Matt Wilson, xen-devel, Ian Campbell At 13:05 -0400 on 11 Apr (1397217936), Konrad Rzeszutek Wilk wrote: > On Wed, Apr 09, 2014 at 03:21:38PM -0700, Matthew Rushton wrote: > > On 04/02/14 03:20, Ian Campbell wrote: > > >On Wed, 2014-04-02 at 11:15 +0100, Jan Beulich wrote: > > >>>>>On 02.04.14 at 12:06, <Ian.Campbell@citrix.com> wrote: > > >>>On Wed, 2014-04-02 at 08:52 +0100, Jan Beulich wrote: > > >>>>>>>On 02.04.14 at 02:17, <mvrushton@gmail.com> wrote: > > >>>>>On 04/01/14 05:22, Tim Deegan wrote: > > >>>>>>As long as we don't also change the default allocation order in > > >>>>>>Xen. :) In general, linux shouldn't rely on the order that Xen > > >>>>>>allocates memory, as that might change later. If the current API > > >>>>>>can't do what's needed, maybe we can add another allocator > > >>>>>>hypercall or flag? > > >>>>>Agree on not relying on the order in the long run. A new hypercall or > > >>>>>flag seems like overkill right now. The question for me comes down to my > > >>>>>proposed change which is more simple and solves the short term problem > > >>>>>or investing time in reworking the Linux code to make large allocations. > > >>>>I think it has become pretty clear by now that we'd rather not alter > > >>>>the hypervisor allocator for a purpose like this. > > > > OK understood see below. > > > > >>>Does it even actually solve the problem? It seems like it is just > > >>>deferring it until sufficient fragmentation has occurred in the system. > > >>>All its really done is make the eventual issue much harder to debug. > > >>Wasn't this largely for Dom0 (in which case fragmentation shouldn't > > >>matter yet)? > > >Dom0 ballooning breaks any assumptions you might make about relying on > > >early allocations. > > > > I think you're missing the point. I'm not arguing that this change > > is a general purpose solution to guarantee that dom0 is contiguous. > > Fragmentation can exist even if dom0 asks for larger allocations > > like it should (which the balloon driver does I believe). What the > > change does do is solve a real problem in the current Linux PCI > > remapping implementation which happens during dom0 intialization. If > > the allocation strategy is arbitrary why not make the proposed > > hypervisor change to make existing Linux implementations behave > > better and in addition fix the problem in Linux so moving forward > > things are safe? > > I think Tim was OK with that - as long as it was based on a flag - meaning > when we do the increase_reservation call we use an extra flag > to ask for contingous PFNs. That's not quite what I meant to say. I think that: (a) Making this change would be OK, as it should be harmless and happens to help people running older linux kernels. That comes with the caveats I mentioned: dom0 should not be relying on this and we (xen) reserve the right to change it later even if that makes unfixed linux dom0 slow again. We also shouldn't make this change on debug builds (to catch cases where a guest relies on the new behaviour for _correctness_). 
(b) The right thing to do is to fix linux so that it asks for contiguous memory in cases where that matters. AFAICT that would involve allocating in larger areas than 1 page. (c) If for some reason the current hypercall API is not sufficient for dom0 to get what it wants, we should consider adding some new operation/flag/mode somewhere. But since AFAIK there's already another path in linux that allocates contiguous DMA buffers for device drivers, presumably this isn't the case. Cheers, Tim. ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-04-13 21:32 ` Tim Deegan @ 2014-04-14 8:51 ` Jan Beulich 2014-04-14 14:40 ` Konrad Rzeszutek Wilk 0 siblings, 1 reply; 55+ messages in thread From: Jan Beulich @ 2014-04-14 8:51 UTC (permalink / raw) To: Konrad Rzeszutek Wilk, Tim Deegan Cc: Keir Fraser, Ian Campbell, Matthew Rushton, AndrewCooper, Matt Wilson, Matt Wilson, xen-devel >>> On 13.04.14 at 23:32, <tim@xen.org> wrote: > (c) If for some reason the current hypercall API is not sufficient > for dom0 to get what it wants, we should consider adding some new > operation/flag/mode somewhere. But since AFAIK there's already > another path in linux that allocates contiguous DMA buffers > for device drivers, presumably this isn't the case. And it should be kept in mind that requesting contiguous memory shouldn't be done at will, as it may end up exhausting the portion of memory intended for DMA-style allocations (SWIOTLB / DMA- coherent allocations in Linux). I.e. neither Dom0 nor DomU should be trying to populate large parts of their memory with contiguous allocation requests to the hypervisor. They may, if they so desire, go and re-arrange their P2M mapping (solely based on what they got handed by doing order-0 allocations). Jan ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-04-14 8:51 ` Jan Beulich @ 2014-04-14 14:40 ` Konrad Rzeszutek Wilk 2014-04-14 15:34 ` Jan Beulich 0 siblings, 1 reply; 55+ messages in thread From: Konrad Rzeszutek Wilk @ 2014-04-14 14:40 UTC (permalink / raw) To: Jan Beulich Cc: Keir Fraser, Ian Campbell, Matthew Rushton, AndrewCooper, Tim Deegan, Matt Wilson, Matt Wilson, xen-devel On Mon, Apr 14, 2014 at 09:51:34AM +0100, Jan Beulich wrote: > >>> On 13.04.14 at 23:32, <tim@xen.org> wrote: > > (c) If for some reason the current hypercall API is not sufficient > > for dom0 to get what it wants, we should consider adding some new > > operation/flag/mode somewhere. But since AFAIK there's already > > another path in linux that allocates contiguous DMA buffers > > for device drivers, presumably this isn't the case. > > And it should be kept in mind that requesting contiguous memory > shouldn't be done at will, as it may end up exhausting the portion > of memory intended for DMA-style allocations (SWIOTLB / DMA- > coherent allocations in Linux). I.e. neither Dom0 nor DomU should > be trying to populate large parts of their memory with contiguous > allocation requests to the hypervisor. They may, if they so desire, > go and re-arrange their P2M mapping (solely based on what they > got handed by doing order-0 allocations). I did try that at some point - and it did not work. The reason for trying this was that during the E820 parsing we would find the MMIO holes/gaps and instead of doing the 'XENMEM_decrease_reservation'/ 'XENMEM_populate_physmap' dance I thought I could just swap the P2M entries. That was OK, but the M2P lookup table was not too thrilled with this. Perhaps I should have used another hypercall to re-arrange the M2P? I think I did try 'XENMEM_exchange' but that is not the right call either. Perhaps I should use XENMEM_remove_from_physmap/XENMEM_add_to_physmap combo ? > > Jan > ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-04-14 14:40 ` Konrad Rzeszutek Wilk @ 2014-04-14 15:34 ` Jan Beulich 2014-04-16 14:15 ` Konrad Rzeszutek Wilk 0 siblings, 1 reply; 55+ messages in thread From: Jan Beulich @ 2014-04-14 15:34 UTC (permalink / raw) To: Konrad Rzeszutek Wilk Cc: Keir Fraser, Ian Campbell, Matthew Rushton, AndrewCooper, Tim Deegan, Matt Wilson, Matt Wilson, xen-devel >>> On 14.04.14 at 16:40, <konrad.wilk@oracle.com> wrote: > That was OK, but the M2P lookup table was not too thrilled with this. > Perhaps I should have used another hypercall to re-arrange the M2P? > I think I did try 'XENMEM_exchange' but that is not the right call either. Yeah, that's allocating new pages in exchange for your old ones. Not really what you want. > Perhaps I should use XENMEM_remove_from_physmap/XENMEM_add_to_physmap > combo ? A pair of MMU_MACHPHYS_UPDATE operations would seem to be the right way of doing this (along with respective kernel internal accounting like set_phys_to_machine(), and perhaps a pair of update_va_mapping operations if the 1:1 map is already in place at that time, and you care about which page contents appears at which virtual address). Jan ^ permalink raw reply [flat|nested] 55+ messages in thread
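Seen from the dom0 side, that recipe is roughly the sketch below: swap which MFN backs each of two PFNs by updating the M2P with a pair of MMU_MACHPHYS_UPDATE operations and mirroring the change in the kernel's P2M. Error handling is omitted, the function name is invented, and the update_va_mapping step mentioned above is left as a comment since it only matters once the frames are mapped in the 1:1 region.

#include <linux/types.h>
#include <xen/interface/xen.h>
#include <asm/xen/hypercall.h>
#include <asm/xen/page.h>

static int example_swap_pfn_backing(unsigned long pfn_a, unsigned long pfn_b)
{
	unsigned long mfn_a = pfn_to_mfn(pfn_a);
	unsigned long mfn_b = pfn_to_mfn(pfn_b);
	struct mmu_update m2p[2];
	int rc;

	/* M2P side: ptr carries the machine address of the frame plus the
	 * MMU_MACHPHYS_UPDATE command, val is the new PFN for that frame. */
	m2p[0].ptr = ((u64)mfn_a << PAGE_SHIFT) | MMU_MACHPHYS_UPDATE;
	m2p[0].val = pfn_b;
	m2p[1].ptr = ((u64)mfn_b << PAGE_SHIFT) | MMU_MACHPHYS_UPDATE;
	m2p[1].val = pfn_a;

	rc = HYPERVISOR_mmu_update(m2p, 2, NULL, DOMID_SELF);
	if (rc)
		return rc;

	/* P2M side: keep the kernel's view in sync with what Xen now holds. */
	set_phys_to_machine(pfn_a, mfn_b);
	set_phys_to_machine(pfn_b, mfn_a);

	/* If these frames are already in the 1:1 mapping, a pair of
	 * HYPERVISOR_update_va_mapping() calls would go here as well. */
	return 0;
}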
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-04-14 15:34 ` Jan Beulich @ 2014-04-16 14:15 ` Konrad Rzeszutek Wilk 2014-04-17 1:34 ` Matthew Rushton 2014-05-07 23:16 ` Matthew Rushton 0 siblings, 2 replies; 55+ messages in thread From: Konrad Rzeszutek Wilk @ 2014-04-16 14:15 UTC (permalink / raw) Cc: Keir Fraser, Ian Campbell, Matthew Rushton, AndrewCooper, Tim Deegan, Matt Wilson, Matt Wilson, xen-devel On Mon, Apr 14, 2014 at 04:34:47PM +0100, Jan Beulich wrote: > >>> On 14.04.14 at 16:40, <konrad.wilk@oracle.com> wrote: > > That was OK, but the M2P lookup table was not too thrilled with this. > > Perhaps I should have used another hypercall to re-arrange the M2P? > > I think I did try 'XENMEM_exchange' but that is not the right call either. > > Yeah, that's allocating new pages in exchange for your old ones. Not > really what you want. > > > Perhaps I should use XENMEM_remove_from_physmap/XENMEM_add_to_physmap > > combo ? > > A pair of MMU_MACHPHYS_UPDATE operations would seem to be the > right way of doing this (along with respective kernel internal accounting > like set_phys_to_machine(), and perhaps a pair of update_va_mapping > operations if the 1:1 map is already in place at that time, and you care > about which page contents appears at which virtual address). OK. Matt & Matthew - my plate is quite filled and I fear that in the next three weeks there is not going to be much time to code up a prototype. Would either one of you be willing to take a crack at this? It would be neat as we could remove a lot of the balloon increase/decrease code in arch/x86/xen/setup.c. Thanks! > > Jan > ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-04-16 14:15 ` Konrad Rzeszutek Wilk @ 2014-04-17 1:34 ` Matthew Rushton 0 siblings, 0 replies; 55+ messages in thread From: Matthew Rushton @ 2014-04-17 1:34 UTC (permalink / raw) To: Konrad Rzeszutek Wilk, msw Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Tim Deegan, Matt Wilson, xen-devel On 04/16/14 07:15, Konrad Rzeszutek Wilk wrote: > On Mon, Apr 14, 2014 at 04:34:47PM +0100, Jan Beulich wrote: >>>>> On 14.04.14 at 16:40, <konrad.wilk@oracle.com> wrote: >>> That was OK, but the M2P lookup table was not too thrilled with this. >>> Perhaps I should have used another hypercall to re-arrange the M2P? >>> I think I did try 'XENMEM_exchange' but that is not the right call either. >> Yeah, that's allocating new pages in exchange for your old ones. Not >> really what you want. >> >>> Perhaps I should use XENMEM_remove_from_physmap/XENMEM_add_to_physmap >>> combo ? >> A pair of MMU_MACHPHYS_UPDATE operations would seem to be the >> right way of doing this (along with respective kernel internal accounting >> like set_phys_to_machine(), and perhaps a pair of update_va_mapping >> operations if the 1:1 map is already in place at that time, and you care >> about which page contents appears at which virtual address). > OK. > > Matt & Matthew - my plate is quite filled and I fear that in the next three > weeks there is not going to be much time to code up a prototype. > > Would either one of you be willing to take a crack at this? It would > be neat as we could remove a lot of the balloon increase/decrease code > in arch/x86/xen/setup.c. > > Thanks! Yeah sure. It won't be immediately but I should be able to do that. >> Jan >> ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-04-16 14:15 ` Konrad Rzeszutek Wilk 2014-04-17 1:34 ` Matthew Rushton @ 2014-05-07 23:16 ` Matthew Rushton 2014-05-08 18:05 ` Konrad Rzeszutek Wilk 2014-05-14 15:06 ` Konrad Rzeszutek Wilk 1 sibling, 2 replies; 55+ messages in thread From: Matthew Rushton @ 2014-05-07 23:16 UTC (permalink / raw) To: Konrad Rzeszutek Wilk, msw Cc: Keir Fraser, Ian Campbell, AndrewCooper, Tim Deegan, Matt Wilson, xen-devel On 04/16/14 07:15, Konrad Rzeszutek Wilk wrote: > On Mon, Apr 14, 2014 at 04:34:47PM +0100, Jan Beulich wrote: >>>>> On 14.04.14 at 16:40, <konrad.wilk@oracle.com> wrote: >>> That was OK, but the M2P lookup table was not too thrilled with this. >>> Perhaps I should have used another hypercall to re-arrange the M2P? >>> I think I did try 'XENMEM_exchange' but that is not the right call either. >> Yeah, that's allocating new pages in exchange for your old ones. Not >> really what you want. >> >>> Perhaps I should use XENMEM_remove_from_physmap/XENMEM_add_to_physmap >>> combo ? >> A pair of MMU_MACHPHYS_UPDATE operations would seem to be the >> right way of doing this (along with respective kernel internal accounting >> like set_phys_to_machine(), and perhaps a pair of update_va_mapping >> operations if the 1:1 map is already in place at that time, and you care >> about which page contents appears at which virtual address). > OK. > > Matt & Matthew - my plate is quite filled and I fear that in the next three > weeks there is not going to be much time to code up a prototype. > > Would either one of you be willing to take a crack at this? It would > be neat as we could remove a lot of the balloon increase/decrease code > in arch/x86/xen/setup.c. > > Thanks! I have a first pass at this. Just need to test it and should have something ready sometime next week or so. >> Jan >> ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-05-07 23:16 ` Matthew Rushton @ 2014-05-08 18:05 ` Konrad Rzeszutek Wilk 2014-05-14 15:06 ` Konrad Rzeszutek Wilk 1 sibling, 0 replies; 55+ messages in thread From: Konrad Rzeszutek Wilk @ 2014-05-08 18:05 UTC (permalink / raw) To: Matthew Rushton Cc: Keir Fraser, Ian Campbell, AndrewCooper, Tim Deegan, msw, Matt Wilson, xen-devel On Wed, May 07, 2014 at 04:16:14PM -0700, Matthew Rushton wrote: > On 04/16/14 07:15, Konrad Rzeszutek Wilk wrote: > >On Mon, Apr 14, 2014 at 04:34:47PM +0100, Jan Beulich wrote: > >>>>>On 14.04.14 at 16:40, <konrad.wilk@oracle.com> wrote: > >>>That was OK, but the M2P lookup table was not too thrilled with this. > >>>Perhaps I should have used another hypercall to re-arrange the M2P? > >>>I think I did try 'XENMEM_exchange' but that is not the right call either. > >>Yeah, that's allocating new pages in exchange for your old ones. Not > >>really what you want. > >> > >>>Perhaps I should use XENMEM_remove_from_physmap/XENMEM_add_to_physmap > >>>combo ? > >>A pair of MMU_MACHPHYS_UPDATE operations would seem to be the > >>right way of doing this (along with respective kernel internal accounting > >>like set_phys_to_machine(), and perhaps a pair of update_va_mapping > >>operations if the 1:1 map is already in place at that time, and you care > >>about which page contents appears at which virtual address). > >OK. > > > >Matt & Matthew - my plate is quite filled and I fear that in the next three > >weeks there is not going to be much time to code up a prototype. > > > >Would either one of you be willing to take a crack at this? It would > >be neat as we could remove a lot of the balloon increase/decrease code > >in arch/x86/xen/setup.c. > > > >Thanks! > > I have a first pass at this. Just need to test it and should have something > ready sometime next week or so. Woohoo! Thanks for the update! > > >>Jan > >> > ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-05-07 23:16 ` Matthew Rushton 2014-05-08 18:05 ` Konrad Rzeszutek Wilk @ 2014-05-14 15:06 ` Konrad Rzeszutek Wilk 2014-05-20 19:26 ` Matthew Rushton 1 sibling, 1 reply; 55+ messages in thread From: Konrad Rzeszutek Wilk @ 2014-05-14 15:06 UTC (permalink / raw) To: Matthew Rushton Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Tim Deegan, msw, Matt Wilson, xen-devel On Wed, May 07, 2014 at 04:16:14PM -0700, Matthew Rushton wrote: > On 04/16/14 07:15, Konrad Rzeszutek Wilk wrote: > >On Mon, Apr 14, 2014 at 04:34:47PM +0100, Jan Beulich wrote: > >>>>>On 14.04.14 at 16:40, <konrad.wilk@oracle.com> wrote: > >>>That was OK, but the M2P lookup table was not too thrilled with this. > >>>Perhaps I should have used another hypercall to re-arrange the M2P? > >>>I think I did try 'XENMEM_exchange' but that is not the right call either. > >>Yeah, that's allocating new pages in exchange for your old ones. Not > >>really what you want. > >> > >>>Perhaps I should use XENMEM_remove_from_physmap/XENMEM_add_to_physmap > >>>combo ? > >>A pair of MMU_MACHPHYS_UPDATE operations would seem to be the > >>right way of doing this (along with respective kernel internal accounting > >>like set_phys_to_machine(), and perhaps a pair of update_va_mapping > >>operations if the 1:1 map is already in place at that time, and you care > >>about which page contents appears at which virtual address). > >OK. > > > >Matt & Matthew - my plate is quite filled and I fear that in the next three > >weeks there is not going to be much time to code up a prototype. > > > >Would either one of you be willing to take a crack at this? It would > >be neat as we could remove a lot of the balloon increase/decrease code > >in arch/x86/xen/setup.c. > > > >Thanks! > > I have a first pass at this. Just need to test it and should have something > ready sometime next week or so. Daniel pointed me to this commit: commit 2e2fb75475c2fc74c98100f1468c8195fee49f3b Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Date: Fri Apr 6 10:07:11 2012 -0400 xen/setup: Populate freed MFNs from non-RAM E820 entries and gaps to E820 RAM .. The other solution (that did not work) was to transplant the MFN in the P2M tree - the ones that were going to be freed were put in the E820_RAM regions past the nr_pages. But the modifications to the M2P array (the other side of creating PTEs) were not carried away. As the hypervisor is the only one capable of modifying that and the only two hypercalls that would do this are: the update_va_mapping (which won't work, as during initial bootup only PFNs up to nr_pages are mapped in the guest) or via the populate hypercall. Where I talk about the 'update_va_mapping' - and I seem to think that it would not work (due to the nr_pages limit). I don't actually remember the details - so I might have been incorrect (hopefully!?). > > >>Jan > >> ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-05-14 15:06 ` Konrad Rzeszutek Wilk @ 2014-05-20 19:26 ` Matthew Rushton 2014-05-23 19:00 ` Konrad Rzeszutek Wilk 0 siblings, 1 reply; 55+ messages in thread From: Matthew Rushton @ 2014-05-20 19:26 UTC (permalink / raw) To: Konrad Rzeszutek Wilk Cc: Keir Fraser, Ian Campbell, AndrewCooper, Tim Deegan, mrushton, msw, Matt Wilson, xen-devel On 05/14/14 08:06, Konrad Rzeszutek Wilk wrote: > On Wed, May 07, 2014 at 04:16:14PM -0700, Matthew Rushton wrote: >> On 04/16/14 07:15, Konrad Rzeszutek Wilk wrote: >>> On Mon, Apr 14, 2014 at 04:34:47PM +0100, Jan Beulich wrote: >>>>>>> On 14.04.14 at 16:40, <konrad.wilk@oracle.com> wrote: >>>>> That was OK, but the M2P lookup table was not too thrilled with this. >>>>> Perhaps I should have used another hypercall to re-arrange the M2P? >>>>> I think I did try 'XENMEM_exchange' but that is not the right call either. >>>> Yeah, that's allocating new pages in exchange for your old ones. Not >>>> really what you want. >>>> >>>>> Perhaps I should use XENMEM_remove_from_physmap/XENMEM_add_to_physmap >>>>> combo ? >>>> A pair of MMU_MACHPHYS_UPDATE operations would seem to be the >>>> right way of doing this (along with respective kernel internal accounting >>>> like set_phys_to_machine(), and perhaps a pair of update_va_mapping >>>> operations if the 1:1 map is already in place at that time, and you care >>>> about which page contents appears at which virtual address). >>> OK. >>> >>> Matt & Matthew - my plate is quite filled and I fear that in the next three >>> weeks there is not going to be much time to code up a prototype. >>> >>> Would either one of you be willing to take a crack at this? It would >>> be neat as we could remove a lot of the balloon increase/decrease code >>> in arch/x86/xen/setup.c. >>> >>> Thanks! >> I have a first pass at this. Just need to test it and should have something >> ready sometime next week or so. > Daniel pointed me to this commit: > ommit 2e2fb75475c2fc74c98100f1468c8195fee49f3b > Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > Date: Fri Apr 6 10:07:11 2012 -0400 > > xen/setup: Populate freed MFNs from non-RAM E820 entries and gaps to E820 RAM > .. > The other solution (that did not work) was to transplant the MFN in > the P2M tree - the ones that were going to be freed were put in > the E820_RAM regions past the nr_pages. But the modifications to the > M2P array (the other side of creating PTEs) were not carried away. > As the hypervisor is the only one capable of modifying that and the > only two hypercalls that would do this are: the update_va_mapping > (which won't work, as during initial bootup only PFNs up to nr_pages > are mapped in the guest) or via the populate hypercall. > > Where I talk about the 'update_va_mapping' - and I seem to think > that it would not work (due to the nr_pages limit). I don't actually > remember the details - so I might have been incorrect (hopefully!?). > Ok I finally have something I'm happy with using the mmu_update hypercall and placing things in the existing E820 map. I don't think the update_va_mapping hypercall is necessary. It ended up being a little more complicated than I originally thought to handle not allocating additional p2m leaf nodes. I'm going on vacation here shortly and can post it when I get back. >>>> Jan >>>> ^ permalink raw reply [flat|nested] 55+ messages in thread
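For the "placing things in the existing E820 map" half of that, the kind of walk involved might look like the sketch below; the helper name and the way the E820 entries reach it are assumptions, and the p2m-leaf-node bookkeeping Matthew calls the complicated part is deliberately not shown.

#include <linux/init.h>
#include <linux/pfn.h>
#include <asm/e820.h>

/*
 * Illustrative only: find the first E820_RAM pfn at or above min_pfn,
 * i.e. a slot in the existing E820 map that a released frame could be
 * remapped into.  Returns ~0UL if no such pfn exists.
 */
static unsigned long __init next_ram_pfn(const struct e820entry *map,
                                         int nr_entries,
                                         unsigned long min_pfn)
{
        int i;

        for (i = 0; i < nr_entries; i++) {
                unsigned long start, end;

                if (map[i].type != E820_RAM)
                        continue;

                start = PFN_UP(map[i].addr);
                end = PFN_DOWN(map[i].addr + map[i].size);
                if (start < min_pfn)
                        start = min_pfn;
                if (start < end)
                        return start;
        }

        return ~0UL;
}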
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-05-20 19:26 ` Matthew Rushton @ 2014-05-23 19:00 ` Konrad Rzeszutek Wilk 2014-06-04 22:25 ` Matthew Rushton 0 siblings, 1 reply; 55+ messages in thread From: Konrad Rzeszutek Wilk @ 2014-05-23 19:00 UTC (permalink / raw) To: Matthew Rushton Cc: Keir Fraser, Ian Campbell, AndrewCooper, Tim Deegan, mrushton, msw, Matt Wilson, xen-devel On Tue, May 20, 2014 at 12:26:57PM -0700, Matthew Rushton wrote: > On 05/14/14 08:06, Konrad Rzeszutek Wilk wrote: > >On Wed, May 07, 2014 at 04:16:14PM -0700, Matthew Rushton wrote: > >>On 04/16/14 07:15, Konrad Rzeszutek Wilk wrote: > >>>On Mon, Apr 14, 2014 at 04:34:47PM +0100, Jan Beulich wrote: > >>>>>>>On 14.04.14 at 16:40, <konrad.wilk@oracle.com> wrote: > >>>>>That was OK, but the M2P lookup table was not too thrilled with this. > >>>>>Perhaps I should have used another hypercall to re-arrange the M2P? > >>>>>I think I did try 'XENMEM_exchange' but that is not the right call either. > >>>>Yeah, that's allocating new pages in exchange for your old ones. Not > >>>>really what you want. > >>>> > >>>>>Perhaps I should use XENMEM_remove_from_physmap/XENMEM_add_to_physmap > >>>>>combo ? > >>>>A pair of MMU_MACHPHYS_UPDATE operations would seem to be the > >>>>right way of doing this (along with respective kernel internal accounting > >>>>like set_phys_to_machine(), and perhaps a pair of update_va_mapping > >>>>operations if the 1:1 map is already in place at that time, and you care > >>>>about which page contents appears at which virtual address). > >>>OK. > >>> > >>>Matt & Matthew - my plate is quite filled and I fear that in the next three > >>>weeks there is not going to be much time to code up a prototype. > >>> > >>>Would either one of you be willing to take a crack at this? It would > >>>be neat as we could remove a lot of the balloon increase/decrease code > >>>in arch/x86/xen/setup.c. > >>> > >>>Thanks! > >>I have a first pass at this. Just need to test it and should have something > >>ready sometime next week or so. > >Daniel pointed me to this commit: > >ommit 2e2fb75475c2fc74c98100f1468c8195fee49f3b > >Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > >Date: Fri Apr 6 10:07:11 2012 -0400 > > > > xen/setup: Populate freed MFNs from non-RAM E820 entries and gaps to E820 RAM > >.. > > The other solution (that did not work) was to transplant the MFN in > > the P2M tree - the ones that were going to be freed were put in > > the E820_RAM regions past the nr_pages. But the modifications to the > > M2P array (the other side of creating PTEs) were not carried away. > > As the hypervisor is the only one capable of modifying that and the > > only two hypercalls that would do this are: the update_va_mapping > > (which won't work, as during initial bootup only PFNs up to nr_pages > > are mapped in the guest) or via the populate hypercall. > > > >Where I talk about the 'update_va_mapping' - and I seem to think > >that it would not work (due to the nr_pages limit). I don't actually > >remember the details - so I might have been incorrect (hopefully!?). > > > > Ok I finally have something I'm happy with using the mmu_update hypercall > and placing things in the existing E820 map. I don't think the > update_va_mapping hypercall is necessary. It ended up being a little more > complicated than I originally thought to handle not allocating additional > p2m leaf nodes. I'm going on vacation here shortly and can post it when I > get back. 
Enjoy the vacation and I am looking forward to seeing the patches when you come back! Thank you! > > >>>>Jan > >>>> > ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-05-23 19:00 ` Konrad Rzeszutek Wilk @ 2014-06-04 22:25 ` Matthew Rushton 2014-06-05 9:32 ` David Vrabel 0 siblings, 1 reply; 55+ messages in thread From: Matthew Rushton @ 2014-06-04 22:25 UTC (permalink / raw) To: Konrad Rzeszutek Wilk Cc: Keir Fraser, Ian Campbell, AndrewCooper, Tim Deegan, mrushton, msw, Matt Wilson, xen-devel On 05/23/14 12:00, Konrad Rzeszutek Wilk wrote: > On Tue, May 20, 2014 at 12:26:57PM -0700, Matthew Rushton wrote: >> On 05/14/14 08:06, Konrad Rzeszutek Wilk wrote: >>> On Wed, May 07, 2014 at 04:16:14PM -0700, Matthew Rushton wrote: >>>> On 04/16/14 07:15, Konrad Rzeszutek Wilk wrote: >>>>> On Mon, Apr 14, 2014 at 04:34:47PM +0100, Jan Beulich wrote: >>>>>>>>> On 14.04.14 at 16:40, <konrad.wilk@oracle.com> wrote: >>>>>>> That was OK, but the M2P lookup table was not too thrilled with this. >>>>>>> Perhaps I should have used another hypercall to re-arrange the M2P? >>>>>>> I think I did try 'XENMEM_exchange' but that is not the right call either. >>>>>> Yeah, that's allocating new pages in exchange for your old ones. Not >>>>>> really what you want. >>>>>> >>>>>>> Perhaps I should use XENMEM_remove_from_physmap/XENMEM_add_to_physmap >>>>>>> combo ? >>>>>> A pair of MMU_MACHPHYS_UPDATE operations would seem to be the >>>>>> right way of doing this (along with respective kernel internal accounting >>>>>> like set_phys_to_machine(), and perhaps a pair of update_va_mapping >>>>>> operations if the 1:1 map is already in place at that time, and you care >>>>>> about which page contents appears at which virtual address). >>>>> OK. >>>>> >>>>> Matt & Matthew - my plate is quite filled and I fear that in the next three >>>>> weeks there is not going to be much time to code up a prototype. >>>>> >>>>> Would either one of you be willing to take a crack at this? It would >>>>> be neat as we could remove a lot of the balloon increase/decrease code >>>>> in arch/x86/xen/setup.c. >>>>> >>>>> Thanks! >>>> I have a first pass at this. Just need to test it and should have something >>>> ready sometime next week or so. >>> Daniel pointed me to this commit: >>> ommit 2e2fb75475c2fc74c98100f1468c8195fee49f3b >>> Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> >>> Date: Fri Apr 6 10:07:11 2012 -0400 >>> >>> xen/setup: Populate freed MFNs from non-RAM E820 entries and gaps to E820 RAM >>> .. >>> The other solution (that did not work) was to transplant the MFN in >>> the P2M tree - the ones that were going to be freed were put in >>> the E820_RAM regions past the nr_pages. But the modifications to the >>> M2P array (the other side of creating PTEs) were not carried away. >>> As the hypervisor is the only one capable of modifying that and the >>> only two hypercalls that would do this are: the update_va_mapping >>> (which won't work, as during initial bootup only PFNs up to nr_pages >>> are mapped in the guest) or via the populate hypercall. >>> >>> Where I talk about the 'update_va_mapping' - and I seem to think >>> that it would not work (due to the nr_pages limit). I don't actually >>> remember the details - so I might have been incorrect (hopefully!?). >>> >> Ok I finally have something I'm happy with using the mmu_update hypercall >> and placing things in the existing E820 map. I don't think the >> update_va_mapping hypercall is necessary. It ended up being a little more >> complicated than I originally thought to handle not allocating additional >> p2m leaf nodes. 
I'm going on vacation here shortly and can post it when I >> get back. > Enjoy the vacation and I am looking forward to seeing the patches when you > come back! > > Thank you! Sent patch to lkml late last week ([PATCH] xen/setup: Remap Xen Identity Mapped RAM). Will cc xen-devel on further correspondence. -Matt >>>>>> Jan >>>>>> ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-06-04 22:25 ` Matthew Rushton @ 2014-06-05 9:32 ` David Vrabel 0 siblings, 0 replies; 55+ messages in thread From: David Vrabel @ 2014-06-05 9:32 UTC (permalink / raw) To: Matthew Rushton, Konrad Rzeszutek Wilk Cc: Keir Fraser, Ian Campbell, AndrewCooper, Tim Deegan, mrushton, msw, Matt Wilson, xen-devel On 04/06/14 23:25, Matthew Rushton wrote: > > Sent patch to lkml late last week ([PATCH] xen/setup: Remap Xen Identity > Mapped RAM). Will cc xen-devel on further correspondance. Please repost, Cc'ing the Linux Xen maintainers and xen-devel. David ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [RFC PATCH] page_alloc: use first half of higher order chunks when halving 2014-03-26 15:15 ` Matt Wilson 2014-03-26 15:59 ` Matthew Rushton @ 2014-03-26 16:34 ` Konrad Rzeszutek Wilk 1 sibling, 0 replies; 55+ messages in thread From: Konrad Rzeszutek Wilk @ 2014-03-26 16:34 UTC (permalink / raw) To: Matt Wilson Cc: Keir Fraser, Matt Wilson, Matthew Rushton, Andrew Cooper, Tim Deegan, Jan Beulich, xen-devel On Wed, Mar 26, 2014 at 05:15:08PM +0200, Matt Wilson wrote: > On Wed, Mar 26, 2014 at 11:08:01AM -0400, Konrad Rzeszutek Wilk wrote: > > > > Could you elaborate a bit more on the use-case please? > > My understanding is that most drivers use a scatter gather list - in which > > case it does not matter if the underlying MFNs in the PFN space are > > not contiguous. > > > > But I presume the issue you are hitting is with drivers doing dma_map_page > > and the page is not 4KB but rather large (compound page). Is that the > > problem you have observed? > > Drivers are using very large size arguments to dma_alloc_coherent() > for things like RX and TX descriptor rings. OK, but that call ends up using chunks from the SWIOTLB buffer which is contiguously allocated. That shouldn't be a problem? > > --msw ^ permalink raw reply [flat|nested] 55+ messages in thread
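As background for the dma_alloc_coherent() point: a driver typically grabs its whole RX or TX descriptor ring in a single coherent allocation, roughly as in the sketch below (the structure and function names are made up; only dma_alloc_coherent() itself is the real API), which is why the entire region has to look machine-contiguous to the device.

#include <linux/device.h>
#include <linux/dma-mapping.h>
#include <linux/errno.h>
#include <linux/gfp.h>

/* Hypothetical descriptor ring; not from any real driver. */
struct demo_ring {
        void *desc;             /* CPU virtual address of the ring */
        dma_addr_t dma;         /* bus/machine address given to the device */
        size_t size;
};

static int demo_alloc_ring(struct device *dev, struct demo_ring *ring,
                           unsigned int entries, size_t desc_size)
{
        /* Easily tens or hundreds of kilobytes in one contiguous chunk. */
        ring->size = entries * desc_size;
        ring->desc = dma_alloc_coherent(dev, ring->size, &ring->dma,
                                        GFP_KERNEL);
        return ring->desc ? 0 : -ENOMEM;
}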