public inbox for linux-kernel@vger.kernel.org
From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Yasunori Goto <y-goto@jp.fujitsu.com>,
	Christoph Lameter <clameter@sgi.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Anthony Liguori <anthony@codemonkey.ws>,
	Mel Gorman <mel@csn.ul.ie>
Subject: Re: [PATCH RFC] hotplug-memory: refactor online_pages to separate zone growth from page onlining
Date: Sat, 29 Mar 2008 16:53:39 -0700
Message-ID: <47EED683.5030200@goop.org>
In-Reply-To: <1206806774.31896.27.camel@nimitz.home.sr71.net>

Dave Hansen wrote:
> On Fri, 2008-03-28 at 19:08 -0700, Jeremy Fitzhardinge wrote:
>   
>> My big remaining problem is how to disable the sysfs interface for this 
>> memory.  I need to prevent any onlining via /sys/devices/system/memory.
>>     
>
> I've been thinking about this some more, and I wish that you wouldn't
> just throw this interface away or completely disable it.

I had no intention of globally disabling it.  I just need to disable it 
for my use case.

>   It actually
> does *exactly* what you want in a way. :)
>
> When the /memoryXX/ directory appears, that means that the hardware has
> found the memory, and that the 'struct page' is allocated and ready to
> be initialized.
>
> When the OS actually wants to use the memory (initialize the 'struct
> page', and free_page() it), it does the 'echo online > /sys...'.  Both
> the 'struct page' and the memory represented by it are untouched until
> the "online".  This was originally in place to avoid fragmenting it
> immediately in the case that the system did not need it.
>
> To me, it sounds like the only different thing that you want is to make
> sure that only partial sections are onlined.  So, shall we work with the
> existing interfaces to online partial sections, or will we just disable
> it entirely when we see Xen?
>   

Well, yes and no.

For the current balloon driver, it doesn't make much sense.  It would 
add a fair amount of complexity without any real gain.  It's currently 
based around alloc_page/free_page.  When it wants to shrink the domain 
and give memory back to the host, it allocates pages, adds the page 
structures to a ballooned pages list, and strips off the backing memory 
and gives it to the host.  Growing the domain is the converse: it gets 
pages from the host, pulls page structures off the list, attaches the 
new backing memory to them and frees them back to the kernel.  If it 
runs out of ballooned page structures, it hotplugs in some memory to 
add more.
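
The shrink/grow cycle described above can be sketched as a small userspace
model.  This is only an illustration under stated assumptions, not the Xen
balloon driver's code: every name here (toy_page, toy_domain, toy_alloc_page,
toy_hotplug, balloon_shrink, balloon_grow) and every constant is hypothetical,
all backed pages are treated as free for the taking, and "the host" is reduced
to flipping a flag.

```c
#include <stdbool.h>
#include <stddef.h>

#define BOOT_PAGES    8   /* page structures backed at start (toy value) */
#define HOTPLUG_CHUNK 4   /* page structures added per simulated hotplug */
#define MAX_PAGES     16  /* hard cap of this toy model */

struct toy_page {
	bool backed;            /* true if the host provides memory for it */
	struct toy_page *next;  /* link on the ballooned list */
};

struct toy_domain {
	struct toy_page pages[MAX_PAGES];
	size_t nr_pages;             /* page structures that exist so far */
	size_t nr_backed;            /* pages currently usable by the kernel */
	struct toy_page *ballooned;  /* page structures without backing memory */
	size_t nr_ballooned;
};

static void toy_init(struct toy_domain *d)
{
	size_t i;

	d->nr_pages = BOOT_PAGES;
	d->nr_backed = BOOT_PAGES;
	d->ballooned = NULL;
	d->nr_ballooned = 0;
	for (i = 0; i < MAX_PAGES; i++) {
		d->pages[i].backed = i < BOOT_PAGES;
		d->pages[i].next = NULL;
	}
}

static void balloon_push(struct toy_domain *d, struct toy_page *p)
{
	p->next = d->ballooned;
	d->ballooned = p;
	d->nr_ballooned++;
}

static struct toy_page *balloon_pop(struct toy_domain *d)
{
	struct toy_page *p = d->ballooned;

	if (p) {
		d->ballooned = p->next;
		p->next = NULL;
		d->nr_ballooned--;
	}
	return p;
}

/* Stand-in for alloc_page(): hand out any page that still has backing. */
static struct toy_page *toy_alloc_page(struct toy_domain *d)
{
	size_t i;

	for (i = 0; i < d->nr_pages; i++)
		if (d->pages[i].backed)
			return &d->pages[i];
	return NULL;
}

/* Shrink: allocate pages, strip their backing, park the page structures. */
static size_t balloon_shrink(struct toy_domain *d, size_t n)
{
	size_t done = 0;

	while (done < n) {
		struct toy_page *p = toy_alloc_page(d);

		if (!p)
			break;
		p->backed = false;  /* backing memory goes to the host */
		d->nr_backed--;
		balloon_push(d, p);
		done++;
	}
	return done;
}

/* Simulated hotplug: create unbacked page structures on the list. */
static size_t toy_hotplug(struct toy_domain *d)
{
	size_t added = 0;

	while (added < HOTPLUG_CHUNK && d->nr_pages < MAX_PAGES) {
		balloon_push(d, &d->pages[d->nr_pages++]);
		added++;
	}
	return added;
}

/* Grow: take a parked page structure, attach backing, free it to the kernel. */
static size_t balloon_grow(struct toy_domain *d, size_t n)
{
	size_t done = 0;

	while (done < n) {
		struct toy_page *p;

		if (!d->ballooned && toy_hotplug(d) == 0)
			break;  /* no page structures left to back */
		p = balloon_pop(d);
		p->backed = true;   /* backing memory comes from the host */
		d->nr_backed++;
		done++;
	}
	return done;
}
```

In this model hotplug only ever adds page structures; it never attaches
memory itself, which is the separation the patch subject is about.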

That said, if (partial-)sections were much smaller - say 2-4 meg - and 
page migration/defrag worked reliably, then we could probably do without 
the balloon driver and do it all in terms of memory hot plug/unplug.  
That would give us a general mechanism which could be driven from 
userspace, by in-kernel Xen/kvm/s390/etc policy modules, or both.  Aside 
from small sections, the only additional requirement would be an online 
hook which can actually attach backing memory to the pages being 
onlined, rather than just assuming an underlying DIMM as current code does.

> For Xen and KVM, how does it get decided that the guest needs more
> memory?  Is this guest or host driven?  Both?  How is the guest
> notified?  Is guest userspace involved at all?

In Xen, either the host or the guest can set the target size for the 
domain, which is capped by the host-set limit.  Aside from possibly 
setting the target size, there's no usermode involvement in managing 
ballooning.  The virtio balloon driver is similar, though from a quick 
look it seems to be entirely driven by the host side.
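
As a rough sketch of the target handling described above: a requested target
is capped by the host-set limit, and the difference from the current size
tells the balloon whether to grow or shrink.  The struct, field and function
names below are hypothetical and chosen only for illustration; they are not
taken from the Xen or virtio balloon drivers.

```c
#include <stddef.h>

struct toy_balloon_state {
	size_t current_pages;  /* pages the domain has right now */
	size_t target_pages;   /* size the domain should converge to */
	size_t max_pages;      /* limit set by the host */
};

/* Host or guest requests a new target; the host-set limit caps it. */
static void toy_set_target(struct toy_balloon_state *s, size_t requested)
{
	s->target_pages = requested > s->max_pages ? s->max_pages : requested;
}

/* Positive result: pages to grow by.  Negative: pages to give back. */
static long toy_balloon_delta(const struct toy_balloon_state *s)
{
	return (long)s->target_pages - (long)s->current_pages;
}
```

Nothing in this sketch involves usermode beyond whoever calls
toy_set_target(), which matches the point that userspace at most sets the
target.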

    J
