From: Dave Hansen <dave@linux.vnet.ibm.com>
To: Daniel Kiper <dkiper@net-space.pl>
Cc: ian.campbell@citrix.com, akpm@linux-foundation.org,
andi.kleen@intel.com, haicheng.li@linux.intel.com,
fengguang.wu@intel.com, jeremy@goop.org, konrad.wilk@oracle.com,
dan.magenheimer@oracle.com, v.tolstov@selfip.ru, pasik@iki.fi,
wdauchy@gmail.com, rientjes@google.com,
xen-devel@lists.xensource.com, linux-kernel@vger.kernel.org,
linux-mm@kvack.org
Subject: Re: [PATCH R4 6/7] mm: Extend memory hotplug API to allow memory hotplug in virtual guests
Date: Tue, 08 Mar 2011 15:51:12 -0800 [thread overview]
Message-ID: <1299628272.9014.3465.camel@nimitz> (raw)
In-Reply-To: <20110308215003.GG27331@router-fw-old.local.net-space.pl>
On Tue, 2011-03-08 at 22:50 +0100, Daniel Kiper wrote:
> This patch extends memory hotplug API to allow easy memory hotplug
> in virtual guests. It contains:
> - generic section aligment macro,
> - online_page_chain and apropriate functions for registering/unregistering
> online page notifiers,
> - add_virtual_memory(u64 *size) function which adds memory region
> of size >= *size above max_pfn; new region is section aligned
> and size is modified to be multiple of section size.
Usually, when you can list stuff out like this, it's a good sign that
they belong in separate patches. I think it's true here as well.
But, these are looking a lot better. It looks like much less code, and
it's quite a bit simpler.
> +/*
> + * online_page_chain contains chain of notifiers called when page is onlined.
> + * When kernel is booting native_online_page_notifier() is registered with
> + * priority 0 as default notifier. Custom notifier should be registered with
> + * pririty > 0. It could be terminal (it should return NOTIFY_STOP on success)
"pririty"?
> + * or not (it should return NOTIFY_DONE or NOTIFY_OK on success; for full list
> + * of return codes look into include/linux/notifier.h).
> + *
> + * Working example of usage: drivers/xen/balloon.c
> + */
> +
> +static RAW_NOTIFIER_HEAD(online_page_chain);
> +
> DEFINE_MUTEX(mem_hotplug_mutex);
>
> void lock_memory_hotplug(void)
> @@ -361,8 +375,33 @@ int __remove_pages(struct zone *zone, unsigned long phys_start_pfn,
> }
> EXPORT_SYMBOL_GPL(__remove_pages);
>
> -void online_page(struct page *page)
> +int register_online_page_notifier(struct notifier_block *nb)
> +{
> + int rc;
> +
> + lock_memory_hotplug();
> + rc = raw_notifier_chain_register(&online_page_chain, nb);
> + unlock_memory_hotplug();
> +
> + return rc;
> +}
> +EXPORT_SYMBOL_GPL(register_online_page_notifier);
> +
> +int unregister_online_page_notifier(struct notifier_block *nb)
> +{
> + int rc;
> +
> + lock_memory_hotplug();
> + rc = raw_notifier_chain_unregister(&online_page_chain, nb);
> + unlock_memory_hotplug();
> +
> + return rc;
> +}
> +EXPORT_SYMBOL_GPL(unregister_online_page_notifier);
The whole "native" thing really is Xen terminology. Could we call this
"generic_online_page_notifier()" perhaps? This really isn't even
"native" either since some hypervisors actually do use this code.
> +static int native_online_page_notifier(struct notifier_block *nb, unsigned long val, void *v)
> {
> + struct page *page = v;
> unsigned long pfn = page_to_pfn(page);
>
> totalram_pages++;
> @@ -375,12 +414,30 @@ void online_page(struct page *page)
> #endif
>
> #ifdef CONFIG_FLATMEM
> - max_mapnr = max(page_to_pfn(page), max_mapnr);
> + max_mapnr = max(pfn, max_mapnr);
> #endif
This is another tidbit that's probably good to do, but it's superfluous
here.
> ClearPageReserved(page);
> init_page_count(page);
> __free_page(page);
> +
> + return NOTIFY_OK;
> +}
> +
> +static struct notifier_block native_online_page_nb = {
> + .notifier_call = native_online_page_notifier,
> + .priority = 0
> +};
That comment about priority really belongs here.
/*
* 0 priority makes this the fallthrough default. All
* architectures wanting to override this should set a
* higher priority and return NOTIFY_STOP to keep this
* from running.
*/
> +static int __init init_online_page_chain(void)
> +{
> + return register_online_page_notifier(&native_online_page_nb);
> +}
> +pure_initcall(init_online_page_chain);
> +
> +static void online_page(struct page *page)
> +{
> + raw_notifier_call_chain(&online_page_chain, 0, page);
> }
>
> static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
> @@ -591,6 +648,36 @@ out:
> }
> EXPORT_SYMBOL_GPL(add_memory);
>
> +/*
> + * add_virtual_memory() adds memory region of size >= *size above max_pfn.
> + * New region is section aligned and size is modified to be multiple of
> + * section size.
Aligned up or down? Why did you choose up or down?
> Those features allow optimal use of address space and
> + * establish proper aligment when this function is called first time after
^^^^^^^^ alignment?
> + * boot (last section not fully populated at boot time may contains unused
> + * memory pages with PG_reserved bit not set; online_pages() does not allow
> + * page onlining in whole section if first page does not have PG_reserved
> + * bit set). Real size of added memory should be established at page onlining
> + * stage.
> + *
> + * This function is often used in virtual guests because mainly they do not
> + * care about new memory region address.
Remember, you're touching generic memory hotplug code here. I really
don't know what a "virtual guest" is or how it relates to this code.
How about something like this?
This code is expected to be used in cases where a certain amount
of memory needs to get added, but when the hardware or
hypervisor does not dictate where it will be placed.
> + * Working example of usage: drivers/xen/balloon.c
Please pull this out. It'll probably become stale before anyone uses
it. I trust people to know how to use cscope. :)
> +int add_virtual_memory(u64 *size)
> +{
> + int nid;
> + u64 start;
> +
> + start = PFN_PHYS(SECTION_ALIGN(max_pfn));
> + *size = (((*size >> PAGE_SHIFT) & PAGE_SECTION_MASK) + PAGES_PER_SECTION) << PAGE_SHIFT;
Why use PFN_PHYS() in one case but not the other?
I'd also highly suggest using the ALIGN() macro in cases like this. It
makes it much more readable:
*size = PFN_PHYS(ALIGN(*size, SECTION_SIZE)));
> + nid = memory_add_physaddr_to_nid(start);
> +
> + return add_memory(nid, start, *size);
> +}
Could you talk a little bit more about how 'size' gets used? Also, are
we sure we want an interface where we're so liberal with 'size'? It
seems like requiring that it be section-aligned is a fair burden to
place on the caller. That way, we're not in a position of _guessing_
what the caller wants (aligning up or down).
-- Dave
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2011-03-08 23:51 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-03-08 21:50 [PATCH R4 6/7] mm: Extend memory hotplug API to allow memory hotplug in virtual guests Daniel Kiper
2011-03-08 23:51 ` Dave Hansen [this message]
2011-03-10 8:51 ` Daniel Kiper
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1299628272.9014.3465.camel@nimitz \
--to=dave@linux.vnet.ibm.com \
--cc=akpm@linux-foundation.org \
--cc=andi.kleen@intel.com \
--cc=dan.magenheimer@oracle.com \
--cc=dkiper@net-space.pl \
--cc=fengguang.wu@intel.com \
--cc=haicheng.li@linux.intel.com \
--cc=ian.campbell@citrix.com \
--cc=jeremy@goop.org \
--cc=konrad.wilk@oracle.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=pasik@iki.fi \
--cc=rientjes@google.com \
--cc=v.tolstov@selfip.ru \
--cc=wdauchy@gmail.com \
--cc=xen-devel@lists.xensource.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).