xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Vasiliy G Tolstov <v.tolstov@selfip.ru>
To: Daniel Kiper <dkiper@net-space.pl>
Cc: xen-devel@lists.xensource.com
Subject: Re: [PATCH] GSoC 2010 - Memory hotplug support for Xen guests - third fully working version
Date: Thu, 12 Aug 2010 11:44:01 +0400	[thread overview]
Message-ID: <1281599041.5454.1.camel@vase.work> (raw)
In-Reply-To: <20100812012224.GA16479@router-fw-old.local.net-space.pl>

В Чтв, 12/08/2010 в 03:22 +0200, Daniel Kiper пишет:
> Hi,
> 
> Here is the third version of memory hotplug support
> for Xen guests patch. This one cleanly applies to
> git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen.git
> repository, xen/memory-hotplug head.
> 
> On Fri, Aug 06, 2010 at 04:03:18PM +0400, Vasiliy G Tolstov wrote:
> [...]
> > Testing on sles 11 sp1 and opensuse 11.3. On results - send e-mail..
> 
> Thx.
> 
> On Fri, Aug 06, 2010 at 12:34:08PM -0400, Konrad Rzeszutek Wilk wrote:
> [...]
> > > +static int allocate_additional_memory(unsigned long nr_pages)
> > > +{
> > > +	long rc;
> > > +	resource_size_t r_min, r_size;
> > > +	struct resource *r;
> > > +	struct xen_memory_reservation reservation = {
> > > +		.address_bits = 0,
> > > +		.extent_order = 0,
> > > +		.domid        = DOMID_SELF
> > > +	};
> > > +	unsigned long flags, i, pfn;
> > > +
> > > +	if (nr_pages > ARRAY_SIZE(frame_list))
> > > +		nr_pages = ARRAY_SIZE(frame_list);
> > > +
> > > +	spin_lock_irqsave(&balloon_lock, flags);
> > > +
> > > +	if (!is_memory_resource_reserved()) {
> > > +
> > > +		/*
> > > +		 * Look for first unused memory region starting at page
> > > +		 * boundary. Skip last memory section created at boot time
> > > +		 * becuase it may contains unused memory pages with PG_reserved
> > > +		 * bit not set (online_pages require PG_reserved bit set).
> > > +		 */
> > > +
> > > +		r = kzalloc(sizeof(struct resource), GFP_KERNEL);
> >
> > You are holding a spinlock here. Kzalloc can sleep
> 
> Thx. Fixed.
> 
> On Fri, Aug 06, 2010 at 10:42:48AM -0700, Jeremy Fitzhardinge wrote:
> > >   - PV on HVM mode is supported now; it was tested on
> > >     git://xenbits.xen.org/people/sstabellini/linux-pvhvm.git
> > >     repository, 2.6.34-pvhvm head,
> >
> > Good.  I noticed you have some specific tests for "xen_pv_domain()" -
> > are there many differences between pv and hvm?
> 
> No. Only those changes are needed where
> xen_domain()/xen_pv_domain() is used.
> 
> > >>>+#ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG
> > >>>+static inline unsigned long current_target(void)
> > >>>+{
> > >>>+	return balloon_stats.target_pages;
> > >>Why does this need its own version?
> > >Because original version return values not bigger
> > >then initial memory allocation which does not allow
> > >memory hotplug to function.
> >
> > But surely they can be combined?  A system without
> > XEN_BALLOON_MEMORY_HOTPLUG is identical to a system with
> > XEN_BALLOON_MEMORY_HOTPLUG which hasn't yet added any memory.  Some
> > variables may become constants (because memory can never be hot-added),
> > but the logic of the code should be the same.
> 
> Done.
> 
> > Overall, this looks much better.  The next step is to split this into at
> > least two patches: one for the core code, and one for the Xen bits.
> > Each patch should do just one logical operation, so if you have several
> > distinct changes to the core code, put them in separate patches.
> 
> I will do that if this patch will be accepted.
> 
> > >diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > >index 38434da..beb1aa7 100644
> > >--- a/arch/x86/Kconfig
> > >+++ b/arch/x86/Kconfig
> > >@@ -1273,7 +1273,7 @@ config ARCH_SELECT_MEMORY_MODEL
> > >  	depends on ARCH_SPARSEMEM_ENABLE
> > >
> > >  config ARCH_MEMORY_PROBE
> > >-	def_bool y
> > >+	def_bool X86_64&&  !XEN
> > >  	depends on MEMORY_HOTPLUG
> >
> > The trouble with making anything statically depend on Xen at config time
> > is that you lose it even if you're not running under Xen.  A pvops
> > kernel can run on bare hardware as well, and we don't want to lose
> > functionality (assume that CONFIG_XEN is always set, since distros do
> > always set it).
> >
> > Can you find a clean way to prevent/disable ARCH_MEMORY_PROBE at runtime
> > when in a Xen context?
> 
> There is no simple way to do that. It requiers to do some
> changes in drivers/base/memory.c code. I think it should
> be done as kernel boot option (on by default to not break
> things using this interface now). If it be useful for maintainers
> of mm/memory_hotplug.c and drivers/base/memory.c code then
> I could do that. Currently original arch/x86/Kconfig version
> is restored.
> 
> > >+/* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG
> > >*/
> > >+static int __ref xen_add_memory(int nid, u64 start, u64 size)
> >
> > Could this be __meminit too then?
> 
> Good question. I looked throught the code and could
> not find any simple explanation why mm/memory_hotplug.c
> authors used __ref instead __meminit. Could you (mm/memory_hotplug.c
> authors/maintainers) tell us why ???
> 
> > >+{
> > >+	pg_data_t *pgdat = NULL;
> > >+	int new_pgdat = 0, ret;
> > >+
> > >+	lock_system_sleep();
> >
> > What's this for?  I see all its other users are in the memory hotplug
> > code, but presumably they're concerned about a real S3 suspend.  Do we
> > care about that here?
> 
> Yes, because as I know S3 state is supported by Xen guests.
> 
> > Actually, this is nearly identical to mm/memory_hotplug.c:add_memory().
> > It looks to me like you should:
> >
> >    * pull the common core out into mm/memory_hotplug.c:__add_memory()
> >      (or a better name)
> >    * make add_memory() do its
> >      register_memory_resource()/firmware_map_add_hotplug() around that
> >      (assuming they're definitely unwanted in the Xen case)
> >    * make xen_add_memory() just call __add_memory() along with whatever
> >      else it needs (which is nothing?)
> >
> > That way you can export a high-level __add_memory function from
> > memory_hotplug.c rather than the two internal detail functions.
> 
> Done.
> 
> > >+		r->name = "System RAM";
> >
> > How about making it clear its Xen hotplug RAM?  Or do things care about
> > the "System RAM" name?
> 
> As I know no however as I saw anybody do not differentiate between
> normal and hotplugged memory. I thought about that ealier however
> stated that this soultion does not give us any real gain. That is why
> I decided to use standard name for hotplugged memory.
> 
> If you have a questions please drop me a line.
> 
> Daniel
> 
> Signed-off-by: Daniel Kiper <dkiper@net-space.pl>
> ---
>  arch/x86/Kconfig               |    2 +-
>  drivers/xen/balloon.c          |   95 ++++++++-------------------------------
>  include/linux/memory_hotplug.h |    3 +-
>  mm/memory_hotplug.c            |   55 ++++++++++++++++-------
>  4 files changed, 61 insertions(+), 94 deletions(-)
> 
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index beb1aa7..9458685 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -1273,7 +1273,7 @@ config ARCH_SELECT_MEMORY_MODEL
>  	depends on ARCH_SPARSEMEM_ENABLE
>  
>  config ARCH_MEMORY_PROBE
> -	def_bool X86_64 && !XEN
> +	def_bool X86_64
>  	depends on MEMORY_HOTPLUG
>  
>  config ILLEGAL_POINTER_VALUE
> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> index 31edc26..5120075 100644
> --- a/drivers/xen/balloon.c
> +++ b/drivers/xen/balloon.c
> @@ -193,63 +193,11 @@ static void balloon_alarm(unsigned long unused)
>  }
>  
>  #ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG
> -static inline unsigned long current_target(void)
> -{
> -	return balloon_stats.target_pages;
> -}
> -
>  static inline u64 is_memory_resource_reserved(void)
>  {
>  	return balloon_stats.hotplug_start_paddr;
>  }
>  
> -/* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */
> -static int __ref xen_add_memory(int nid, u64 start, u64 size)
> -{
> -	pg_data_t *pgdat = NULL;
> -	int new_pgdat = 0, ret;
> -
> -	lock_system_sleep();
> -
> -	if (!node_online(nid)) {
> -		pgdat = hotadd_new_pgdat(nid, start);
> -		ret = -ENOMEM;
> -		if (!pgdat)
> -			goto out;
> -		new_pgdat = 1;
> -	}
> -
> -	/* call arch's memory hotadd */
> -	ret = arch_add_memory(nid, start, size);
> -
> -	if (ret < 0)
> -		goto error;
> -
> -	/* we online node here. we can't roll back from here. */
> -	node_set_online(nid);
> -
> -	if (new_pgdat) {
> -		ret = register_one_node(nid);
> -		/*
> -		 * If sysfs file of new node can't create, cpu on the node
> -		 * can't be hot-added. There is no rollback way now.
> -		 * So, check by BUG_ON() to catch it reluctantly..
> -		 */
> -		BUG_ON(ret);
> -	}
> -
> -	goto out;
> -
> -error:
> -	/* rollback pgdat allocation */
> -	if (new_pgdat)
> -		rollback_node_hotadd(nid, pgdat);
> -
> -out:
> -	unlock_system_sleep();
> -	return ret;
> -}
> -
>  static int allocate_additional_memory(unsigned long nr_pages)
>  {
>  	long rc;
> @@ -265,8 +213,6 @@ static int allocate_additional_memory(unsigned long nr_pages)
>  	if (nr_pages > ARRAY_SIZE(frame_list))
>  		nr_pages = ARRAY_SIZE(frame_list);
>  
> -	spin_lock_irqsave(&balloon_lock, flags);
> -
>  	if (!is_memory_resource_reserved()) {
>  
>  		/*
> @@ -280,7 +226,7 @@ static int allocate_additional_memory(unsigned long nr_pages)
>  
>  		if (!r) {
>  			rc = -ENOMEM;
> -			goto out;
> +			goto out_0;
>  		}
>  
>  		r->name = "System RAM";
> @@ -293,12 +239,14 @@ static int allocate_additional_memory(unsigned long nr_pages)
>  
>  		if (rc < 0) {
>  			kfree(r);
> -			goto out;
> +			goto out_0;
>  		}
>  
>  		balloon_stats.hotplug_start_paddr = r->start;
>  	}
>  
> +	spin_lock_irqsave(&balloon_lock, flags);
> +
>  	pfn = PFN_DOWN(balloon_stats.hotplug_start_paddr + balloon_stats.hotplug_size);
>  
>  	for (i = 0; i < nr_pages; ++i, ++pfn)
> @@ -310,7 +258,7 @@ static int allocate_additional_memory(unsigned long nr_pages)
>  	rc = HYPERVISOR_memory_op(XENMEM_populate_physmap, &reservation);
>  
>  	if (rc < 0)
> -		goto out;
> +		goto out_1;
>  
>  	pfn = PFN_DOWN(balloon_stats.hotplug_start_paddr + balloon_stats.hotplug_size);
>  
> @@ -323,9 +271,10 @@ static int allocate_additional_memory(unsigned long nr_pages)
>  	balloon_stats.hotplug_size += rc << PAGE_SHIFT;
>  	balloon_stats.current_pages += rc;
>  
> -out:
> +out_1:
>  	spin_unlock_irqrestore(&balloon_lock, flags);
>  
> +out_0:
>  	return rc < 0 ? rc : rc != nr_pages;
>  }
>  
> @@ -337,11 +286,11 @@ static void hotplug_allocated_memory(void)
>  
>  	nid = memory_add_physaddr_to_nid(balloon_stats.hotplug_start_paddr);
>  
> -	ret = xen_add_memory(nid, balloon_stats.hotplug_start_paddr,
> +	ret = add_registered_memory(nid, balloon_stats.hotplug_start_paddr,
>  						balloon_stats.hotplug_size);
>  
>  	if (ret) {
> -		pr_err("%s: xen_add_memory: Memory hotplug failed: %i\n",
> +		pr_err("%s: add_registered_memory: Memory hotplug failed: %i\n",
>  			__func__, ret);
>  		goto error;
>  	}
> @@ -388,18 +337,6 @@ out:
>  	balloon_stats.hotplug_size = 0;
>  }
>  #else
> -static unsigned long current_target(void)
> -{
> -	unsigned long target = balloon_stats.target_pages;
> -
> -	target = min(target,
> -		     balloon_stats.current_pages +
> -		     balloon_stats.balloon_low +
> -		     balloon_stats.balloon_high);
> -
> -	return target;
> -}
> -
>  static inline u64 is_memory_resource_reserved(void)
>  {
>  	return 0;
> @@ -407,13 +344,21 @@ static inline u64 is_memory_resource_reserved(void)
>  
>  static inline int allocate_additional_memory(unsigned long nr_pages)
>  {
> +	/*
> +	 * CONFIG_XEN_BALLOON_MEMORY_HOTPLUG is not set.
> +	 * balloon_stats.target_pages could not be bigger
> +	 * than balloon_stats.current_pages because additional
> +	 * memory allocation is not possible.
> +	 */
> +	balloon_stats.target_pages = balloon_stats.current_pages;
> +
>  	return 0;
>  }
>  
>  static inline void hotplug_allocated_memory(void)
>  {
>  }
> -#endif
> +#endif /* CONFIG_XEN_BALLOON_MEMORY_HOTPLUG */
>  
>  static int increase_reservation(unsigned long nr_pages)
>  {
> @@ -553,7 +498,7 @@ static void balloon_process(struct work_struct *work)
>  	mutex_lock(&balloon_mutex);
>  
>  	do {
> -		credit = current_target() - balloon_stats.current_pages;
> +		credit = balloon_stats.target_pages - balloon_stats.current_pages;
>  
>  		if (credit > 0) {
>  			if (balloon_stats.balloon_low || balloon_stats.balloon_high)
> @@ -572,7 +517,7 @@ static void balloon_process(struct work_struct *work)
>  	} while ((credit != 0) && !need_sleep);
>  
>  	/* Schedule more work if there is some still to be done. */
> -	if (current_target() != balloon_stats.current_pages)
> +	if (balloon_stats.target_pages != balloon_stats.current_pages)
>  		mod_timer(&balloon_timer, jiffies + HZ);
>  	else if (is_memory_resource_reserved())
>  		hotplug_allocated_memory();
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 6652eae..37f1894 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -202,8 +202,7 @@ static inline int is_mem_section_removable(unsigned long pfn,
>  }
>  #endif /* CONFIG_MEMORY_HOTREMOVE */
>  
> -extern pg_data_t *hotadd_new_pgdat(int nid, u64 start);
> -extern void rollback_node_hotadd(int nid, pg_data_t *pgdat);
> +extern int add_registered_memory(int nid, u64 start, u64 size);
>  extern int add_memory(int nid, u64 start, u64 size);
>  extern int arch_add_memory(int nid, u64 start, u64 size);
>  extern int remove_memory(u64 start, u64 size);
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 143e03c..48a65bb 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -453,7 +453,7 @@ int online_pages(unsigned long pfn, unsigned long nr_pages)
>  #endif /* CONFIG_MEMORY_HOTPLUG_SPARSE */
>  
>  /* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */
> -pg_data_t __ref *hotadd_new_pgdat(int nid, u64 start)
> +static pg_data_t __ref *hotadd_new_pgdat(int nid, u64 start)
>  {
>  	struct pglist_data *pgdat;
>  	unsigned long zones_size[MAX_NR_ZONES] = {0};
> @@ -473,32 +473,21 @@ pg_data_t __ref *hotadd_new_pgdat(int nid, u64 start)
>  
>  	return pgdat;
>  }
> -EXPORT_SYMBOL_GPL(hotadd_new_pgdat);
>  
> -void rollback_node_hotadd(int nid, pg_data_t *pgdat)
> +static void rollback_node_hotadd(int nid, pg_data_t *pgdat)
>  {
>  	arch_refresh_nodedata(nid, NULL);
>  	arch_free_nodedata(pgdat);
>  	return;
>  }
> -EXPORT_SYMBOL_GPL(rollback_node_hotadd);
> -
>  
>  /* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */
> -int __ref add_memory(int nid, u64 start, u64 size)
> +static int __ref __add_memory(int nid, u64 start, u64 size)
>  {
>  	pg_data_t *pgdat = NULL;
>  	int new_pgdat = 0;
> -	struct resource *res;
>  	int ret;
>  
> -	lock_system_sleep();
> -
> -	res = register_memory_resource(start, size);
> -	ret = -EEXIST;
> -	if (!res)
> -		goto out;
> -
>  	if (!node_online(nid)) {
>  		pgdat = hotadd_new_pgdat(nid, start);
>  		ret = -ENOMEM;
> @@ -535,11 +524,45 @@ error:
>  	/* rollback pgdat allocation and others */
>  	if (new_pgdat)
>  		rollback_node_hotadd(nid, pgdat);
> -	if (res)
> -		release_memory_resource(res);
>  
>  out:
> +	return ret;
> +}
> +
> +int __ref add_registered_memory(int nid, u64 start, u64 size)
> +{
> +	int ret;
> +
> +	lock_system_sleep();
> +	ret = __add_memory(nid, start, size);
>  	unlock_system_sleep();
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(add_registered_memory);
> +
> +int __ref add_memory(int nid, u64 start, u64 size)
> +{
> +	int ret = -EEXIST;
> +	struct resource *res;
> +
> +	lock_system_sleep();
> +
> +	res = register_memory_resource(start, size);
> +
> +	if (!res)
> +		goto out;
> +
> +	ret = __add_memory(nid, start, size);
> +
> +	if (!ret)
> +		goto out;
> +
> +	release_memory_resource(res);
> +
> +out:
> +	unlock_system_sleep();
> +
>  	return ret;
>  }
>  EXPORT_SYMBOL_GPL(add_memory);
> 

Can You provide patch to sles 11 sp1 ?
I found that sles has some modification to the kernel and patch does not
apply cleanly =(.


-- 
Vasiliy G Tolstov <v.tolstov@selfip.ru>
Selfip.Ru

  reply	other threads:[~2010-08-12  7:44 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-08-12  1:22 [PATCH] GSoC 2010 - Memory hotplug support for Xen guests - third fully working version Daniel Kiper
2010-08-12  7:44 ` Vasiliy G Tolstov [this message]
2010-08-12 23:20   ` Daniel Kiper
2010-08-13  0:19     ` Pasi Kärkkäinen
2010-08-16  9:23       ` Daniel Kiper
2010-08-13  5:49     ` Vasiliy G Tolstov
2010-08-16  1:33       ` Daniel Kiper
2010-08-16  6:32         ` Vasiliy G Tolstov
2010-08-13  0:43 ` Jeremy Fitzhardinge
2010-08-16 15:44   ` Daniel Kiper
2010-08-25 21:33     ` Jeremy Fitzhardinge
2010-08-26  1:23       ` KAMEZAWA Hiroyuki
2010-08-13  0:46 ` Jeremy Fitzhardinge
2010-08-13  0:48   ` [Xen-devel] " Jeremy Fitzhardinge

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1281599041.5454.1.camel@vase.work \
    --to=v.tolstov@selfip.ru \
    --cc=dkiper@net-space.pl \
    --cc=xen-devel@lists.xensource.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).