xen-devel.lists.xenproject.org archive mirror
* Memory fragmentation, order>0 allocation, and 4.0 dynamic RAM optimization features
@ 2010-02-08 18:13 Dan Magenheimer
  2010-02-08 19:11 ` Keir Fraser
  2010-02-09 13:13 ` Jan Beulich
  0 siblings, 2 replies; 8+ messages in thread
From: Dan Magenheimer @ 2010-02-08 18:13 UTC
  To: xen-devel, Grzegorz Milos, Patrick Colp, Andrew Peace,
	George Dunlap
  Cc: Ian Pratt, Keir Fraser, Jan Beulich

In a recent thread:

http://lists.xensource.com/archives/html/xen-devel/2010-02/msg00295.html

Jan Beulich points out that the memory fragmentation that results
from Transcendent Memory ("tmem") sometimes causes problems for
domain creation and PV migration because the shadow code requires
order=2 allocations and the domain struct is order=4.

Though tmem accelerates fragmentation, I *think* this fragmentation
can occur with page sharing/swapping, and possibly PoD.  In fact,
I think it can occur even with just ballooning.

I think the domain struct issue should be relatively easy to
resolve (though maybe with a large patch), but the shadow code
may be much harder.

But unless the shadow code is also fixed, theoretically 75% of RAM
could be "free" but domain creation/migration failures may occur,
reported only as insufficient memory.
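
The 75% figure can be checked with a small model. This is only a sketch,
assuming a buddy-style allocator that hands out naturally aligned blocks of
2**order pages; the helper names (worst_case_map, can_allocate) are
hypothetical and not taken from the Xen source. If one page in every aligned
group of four is in use, three quarters of RAM is free, yet no order-2 block
can be satisfied:

```python
def worst_case_map(num_pages):
    """Hypothetical worst case: exactly one in-use page per aligned group of four.

    Returns a list of booleans, True meaning the page is free.
    """
    return [i % 4 != 0 for i in range(num_pages)]


def can_allocate(free, order):
    """Return True if some naturally aligned run of 2**order pages is all free."""
    size = 1 << order
    return any(
        all(free[start:start + size])
        for start in range(0, len(free) - size + 1, size)
    )


free = worst_case_map(1024)
free_fraction = sum(free) / len(free)

print(free_fraction)          # 0.75
print(can_allocate(free, 0))  # True: single pages are plentiful
print(can_allocate(free, 1))  # True: pages 2 and 3 of each group are free
print(can_allocate(free, 2))  # False: every aligned group of four has a hole
```

In this model the order-2 request fails even though most memory is free,
which is the "insufficient memory" symptom described above.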

Clearly it's too late to fix this for 4.0 but, given that 4.0-based
product announcements are likely to emphasize the new 4.0 memory
optimization technologies, it might be good to resolve it very
early in 4.1/xen-unstable development.

Comments?

Are there other known order>0 allocations that might result
in similar issues?

Thanks,
Dan

* Re: Memory fragmentation, order>0 allocation, and 4.0 dynamic RAM optimization features
  2010-02-08 18:13 Memory fragmentation, order>0 allocation, and 4.0 dynamic RAM optimization features Dan Magenheimer
@ 2010-02-08 19:11 ` Keir Fraser
  2010-02-09 10:50   ` Tim Deegan
  2010-02-09 13:13 ` Jan Beulich
  1 sibling, 1 reply; 8+ messages in thread
From: Keir Fraser @ 2010-02-08 19:11 UTC
  To: Dan Magenheimer, xen-devel@lists.xensource.com, Grzegorz Milos,
	Patrick Colp, Andre
  Cc: Ian Pratt, Jan Beulich

On 08/02/2010 18:13, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:

> In a recent thread:
> 
> http://lists.xensource.com/archives/html/xen-devel/2010-02/msg00295.html
> 
> Jan Beulich points out that the memory fragmentation that results
> from Transcendent Memory ("tmem") sometimes causes problems for
> domain creation and PV migration because the shadow code requires
> order=2 allocations and the domain struct is order=4.
> 
> Though tmem accelerates fragmentation, I *think* this fragmentation
> can occur with page sharing/swapping, and possibly PoD.  In fact,
> I think it can occur even with just ballooning.
> 
> I think the domain struct issue should be relatively easy to
> resolve (though maybe with a large patch), but the shadow code
> may be much harder.

I think everything but the shadow use of order-2 allocations is pretty easy
to fix; just a case of logically carving up the multi-page structures.

I'm not sure, but suspect that fragmenting the shadow allocations may
require extra book-keeping space in the page_info structure, for example.

 -- Keir

* Re: Memory fragmentation, order>0 allocation, and 4.0 dynamic RAM optimization features
  2010-02-08 19:11 ` Keir Fraser
@ 2010-02-09 10:50   ` Tim Deegan
  0 siblings, 0 replies; 8+ messages in thread
From: Tim Deegan @ 2010-02-09 10:50 UTC
  To: Keir Fraser
  Cc: Dan Magenheimer, xen-devel@lists.xensource.com, George Dunlap,
	Ian Pratt, Jan Beulich, Patrick Colp, Grzegorz Milos,
	Andrew Peace

At 19:11 +0000 on 08 Feb (1265656284), Keir Fraser wrote:
> I think everything but the shadow use of order-2 allocations is pretty easy
> to fix; just a case of logically carving up the multi-page structures.
> 
> I'm not sure, but suspect that fragmenting the shadow allocations may
> require extra book-keeping space in the page_info structure, for example.

I think it can be done without expanding page_info, but it involves
pretty invasive changes to all the parts of the shadow code that deal
with guest<->shadow mappings and with iterating across shadows.

As I said in response to an earlier email, I think the right thing to do
here is to move pretty much all allocation in Xen to superpage
granularity (and not just because it makes this issue go away!).

Tim

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, XenServer Engineering
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

* Re: Memory fragmentation, order>0 allocation, and 4.0 dynamic RAM optimization features
  2010-02-08 18:13 Memory fragmentation, order>0 allocation, and 4.0 dynamic RAM optimization features Dan Magenheimer
  2010-02-08 19:11 ` Keir Fraser
@ 2010-02-09 13:13 ` Jan Beulich
  1 sibling, 0 replies; 8+ messages in thread
From: Jan Beulich @ 2010-02-09 13:13 UTC
  To: Dan Magenheimer
  Cc: xen-devel, Tim Deegan, George Dunlap, Patrick Colp, Ian Pratt,
	Andrew Peace, Keir Fraser, Grzegorz Milos

>>> Dan Magenheimer <dan.magenheimer@oracle.com> 08.02.10 19:13 >>>
>Are there other known order>0 allocations that might result
>in similar issues?

The passthrough code has data structures which dynamically (nr_irqs-
based) can exceed a page (see pt_irq_create_bind_vtd()).

Jan

* Re: Memory fragmentation, order>0 allocation, and 4.0 dynamic RAM optimization features
@ 2010-02-18  8:04 Jan Beulich
  2010-02-18 16:09 ` Dan Magenheimer
  0 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2010-02-18  8:04 UTC
  To: Dan Magenheimer
  Cc: xen-devel, Tim Deegan, George Dunlap, Patrick Colp, Ian Pratt,
	Andrew Peace, Keir Fraser, Grzegorz Milos

>>> Dan Magenheimer <dan.magenheimer@oracle.com> 08.02.10 19:13 >>>
>Are there other known order>0 allocations that might result
>in similar issues?

Interestingly, tmem itself indirectly causes order-1 allocations (through
the use of xmem_pool_create(), sizeof(struct xmem_pool) = 0x18d0
on a non-debug build).
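
The arithmetic behind "order-1" can be sketched as follows, assuming 4 KiB
pages; order_from_bytes is a hypothetical helper written for illustration,
not a function quoted from the Xen tree:

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def order_from_bytes(size):
    """Smallest order such that 2**order pages hold `size` bytes."""
    pages = (size + PAGE_SIZE - 1) // PAGE_SIZE  # round up to whole pages
    order = 0
    while (1 << order) < pages:
        order += 1
    return order


# 0x18d0 = 6352 bytes, which needs two pages and therefore order 1.
print(order_from_bytes(0x18d0))          # 1
print(order_from_bytes(PAGE_SIZE))       # 0: exactly one page
print(order_from_bytes(16 * PAGE_SIZE))  # 4: sixteen pages
```

Any structure even one byte over a page is pushed to order 1, so it needs two
contiguous pages rather than one.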

Jan

* RE: Memory fragmentation, order>0 allocation, and 4.0 dynamic RAM optimization features
  2010-02-18  8:04 Jan Beulich
@ 2010-02-18 16:09 ` Dan Magenheimer
  2010-02-18 16:18   ` Jan Beulich
  0 siblings, 1 reply; 8+ messages in thread
From: Dan Magenheimer @ 2010-02-18 16:09 UTC
  To: Jan Beulich
  Cc: xen-devel, Tim Deegan, George Dunlap, Patrick Colp, Ian Pratt,
	Andrew Peace, Keir Fraser, Grzegorz Milos

> From: Jan Beulich [mailto:JBeulich@novell.com]
> >>> Dan Magenheimer <dan.magenheimer@oracle.com> 08.02.10 19:13 >>>
> >Are there other known order>0 allocations that might result
> >in similar issues?
> 
> Interestingly, tmem itself indirectly causes order-1 allocations (through
> the use of xmem_pool_create(), sizeof(struct xmem_pool) = 0x18d0
> on a non-debug build).

Well that's embarrassing :-}

But to be fair:
1) tmem is just using xen infrastructure code that suffers from the
   same pervasive problem (though admittedly I added the interface
   to xmalloc_tlsf to enable additional xmem pools to be created)
2) tmem fails semi-gracefully by just turning itself off for
   a domain that fails this order-1 allocation (though it really
   need only disable persistent pools, not all tmem pools)

But ignoring my flimsy excuses, Jan, do you have some debug code
you are using to identify order>0 allocations?  If so, could I
have a copy... and perhaps Keir would consider adding
it post-4.0 to make it easier to search-and-destroy.
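
One possible shape for such a debugging aid, purely as a sketch: wrap the
allocator entry point and count every request with order > 0 per call site.
Everything here is hypothetical (the TrackingAllocator class, the site labels,
the 4 KiB page size); it models the idea in Python and is not the patch
discussed in this thread.

```python
from collections import Counter

PAGE_SIZE = 4096  # assumption: 4 KiB pages


class TrackingAllocator:
    """Toy allocator wrapper that records order>0 requests by call site."""

    def __init__(self):
        # Maps (site, order) to the number of times it was requested.
        self.high_order = Counter()

    def alloc(self, order, site):
        """Allocate 2**order pages; `site` is a caller-supplied label."""
        if order > 0:
            self.high_order[(site, order)] += 1
        return bytearray((1 << order) * PAGE_SIZE)

    def report(self):
        """Return offenders sorted by descending count, then by (site, order)."""
        return sorted(self.high_order.items(), key=lambda kv: (-kv[1], kv[0]))


tracker = TrackingAllocator()
tracker.alloc(0, "single_page_user")   # order 0: not recorded
tracker.alloc(1, "xmem_pool_create")
tracker.alloc(2, "shadow")
tracker.alloc(2, "shadow")

for (site, order), count in tracker.report():
    print(f"{site}: order={order} count={count}")
```

In C the call-site label would more naturally come from the file and line of
the caller, but the bookkeeping is the same.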

* RE: Memory fragmentation, order>0 allocation, and 4.0 dynamic RAM optimization features
  2010-02-18 16:09 ` Dan Magenheimer
@ 2010-02-18 16:18   ` Jan Beulich
  2010-02-18 17:32     ` Dan Magenheimer
  0 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2010-02-18 16:18 UTC
  To: Dan Magenheimer
  Cc: xen-devel, Tim Deegan, George Dunlap, Patrick Colp, Ian Pratt,
	Andrew Peace, Keir Fraser, Grzegorz Milos

>>> Dan Magenheimer <dan.magenheimer@oracle.com> 18.02.10 17:09 >>>
>But ignoring my flimsy excuses, Jan, do you have some debug code
>you are using to identify order>0 allocations?  If so, could I
>have a copy... and perhaps Keir would consider adding
>it post-4.0 to make it easier to search-and-destroy.

I actually noticed this only as a side effect from a much uglier debugging
patch - observing apparent memory corruption with no apparent pattern
during save/restore/migrate, I finally decided to try a brute force method
and track all allocations. Since you are so eager to point out and fix all
order > 0 allocations, I was quite surprised to see one while tmem
itself initialized its state for Dom0. Hence I thought I'd point it out. The
patch as it stands is, I think, not really a general debugging aid - if you
think differently, I can of course still share it.

Jan

* RE: Memory fragmentation, order>0 allocation, and 4.0 dynamic RAM optimization features
  2010-02-18 16:18   ` Jan Beulich
@ 2010-02-18 17:32     ` Dan Magenheimer
  0 siblings, 0 replies; 8+ messages in thread
From: Dan Magenheimer @ 2010-02-18 17:32 UTC
  To: Jan Beulich
  Cc: xen-devel, Tim Deegan, George Dunlap, Patrick Colp, Ian Pratt,
	Andrew Peace, Keir Fraser, Grzegorz Milos

> Since you are so eager to point out and fix all order > 0 allocations

"eager" is a rather poor word choice :-)

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@novell.com]
> Sent: Thursday, February 18, 2010 9:19 AM
> To: Dan Magenheimer
> Cc: Grzegorz Milos; Tim Deegan; PatrickColp; Andrew Peace;
> GeorgeDunlap; Ian Pratt; Keir Fraser; xen-devel@lists.xensource.com
> Subject: RE: Memory fragmentation, order>0 allocation, and 4.0 dynamic
> RAM optimization features
> 
> >>> Dan Magenheimer <dan.magenheimer@oracle.com> 18.02.10 17:09 >>>
> >But ignoring my flimsy excuses, Jan, do you have some debug code
> >you are using to identify order>0 allocations?  If so, could I
> >have a copy... and perhaps Keir would consider adding
> >it post-4.0 to make it easier to search-and-destroy.
> 
> I actually noticed this only as a side effect from a much uglier debugging
> patch - observing apparent memory corruption with no apparent pattern
> during save/restore/migrate, I finally decided to try a brute force method
> and track all allocations. Since you are so eager to point out and fix all
> order > 0 allocations, I was quite surprised to see one while tmem
> itself initialized its state for Dom0. Hence I thought I'd point it out. The
> patch as it stands is, I think, not really a general debugging aid - if you
> think differently, I can of course still share it.
> 
> Jan
> 

Thread overview: 8+ messages
2010-02-08 18:13 Memory fragmentation, order>0 allocation, and 4.0 dynamic RAM optimization features Dan Magenheimer
2010-02-08 19:11 ` Keir Fraser
2010-02-09 10:50   ` Tim Deegan
2010-02-09 13:13 ` Jan Beulich
  -- strict thread matches above, loose matches on Subject: below --
2010-02-18  8:04 Jan Beulich
2010-02-18 16:09 ` Dan Magenheimer
2010-02-18 16:18   ` Jan Beulich
2010-02-18 17:32     ` Dan Magenheimer
