linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@linux-foundation.org>
To: Rafael Aquini <aquini@redhat.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org,
	Rusty Russell <rusty@rustcorp.com.au>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Rik van Riel <riel@redhat.com>, Mel Gorman <mel@csn.ul.ie>,
	Andi Kleen <andi@firstfloor.org>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Minchan Kim <minchan@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Subject: Re: [PATCH v10 1/5] mm: introduce a common interface for balloon pages mobility
Date: Mon, 17 Sep 2012 15:15:43 -0700	[thread overview]
Message-ID: <20120917151543.fd523040.akpm@linux-foundation.org> (raw)
In-Reply-To: <89c9f4096bbad072e155445fcdf1805d47ddf48e.1347897793.git.aquini@redhat.com>

On Mon, 17 Sep 2012 13:38:16 -0300
Rafael Aquini <aquini@redhat.com> wrote:

> Memory fragmentation introduced by ballooning might reduce significantly
> the number of 2MB contiguous memory blocks that can be used within a guest,
> thus imposing performance penalties associated with the reduced number of
> transparent huge pages that could be used by the guest workload.
> 
> This patch introduces a common interface to help a balloon driver on
> making its page set movable to compaction, and thus allowing the system
> to better leverage the compation efforts on memory defragmentation.
> 
>
> ...
>
> +#ifndef _LINUX_BALLOON_COMPACTION_H
> +#define _LINUX_BALLOON_COMPACTION_H
> +
> +#include <linux/rcupdate.h>
> +#include <linux/pagemap.h>
> +#include <linux/gfp.h>
> +#include <linux/err.h>
> +
> +/* return code to identify when a ballooned page has been migrated */
> +#define BALLOON_MIGRATION_RETURN	0xba1100

I didn't really spend enough time to work out why this was done this
way, but I know a hack when I see one!

We forgot to document the a_ops.migratepage() return value.  Perhaps
it's time to work out what it should be.

> +#ifdef CONFIG_BALLOON_COMPACTION
> +#define count_balloon_event(e)		count_vm_event(e)
> +#define free_balloon_mapping(m)		kfree(m)

It would be better to write these in C please.  That way we get
typechecking, even when CONFIG_BALLOON_COMPACTION=n.

> +extern bool isolate_balloon_page(struct page *);
> +extern void putback_balloon_page(struct page *);
> +extern int migrate_balloon_page(struct page *newpage,
> +				struct page *page, enum migrate_mode mode);
> +extern struct address_space *alloc_balloon_mapping(void *balloon_device,
> +				const struct address_space_operations *a_ops);

There's a useful convention that interface identifiers are prefixed by
their interface's name.  IOW, everything in this file would start with
"balloon_".  balloon_page_isolate, balloon_page_putback, etc.  I think
we could follow that convention here?

> +static inline void assign_balloon_mapping(struct page *page,
> +					  struct address_space *mapping)
> +{
> +	page->mapping = mapping;
> +	smp_wmb();
> +}
> +
> +static inline void clear_balloon_mapping(struct page *page)
> +{
> +	page->mapping = NULL;
> +	smp_wmb();
> +}
> +
> +static inline gfp_t balloon_mapping_gfp_mask(void)
> +{
> +	return GFP_HIGHUSER_MOVABLE;
> +}
> +
> +static inline bool __is_movable_balloon_page(struct page *page)
> +{
> +	struct address_space *mapping = ACCESS_ONCE(page->mapping);
> +	smp_read_barrier_depends();
> +	return mapping_balloon(mapping);
> +}

hm.  Are these barrier tricks copied from somewhere else, or home-made?

> +/*
> + * movable_balloon_page - test page->mapping->flags to identify balloon pages
> + *			  that can be moved by compaction/migration.
> + *
> + * This function is used at core compaction's page isolation scheme and so it's
> + * exposed to several system pages which may, or may not, be part of a memory
> + * balloon, and thus we cannot afford to hold a page locked to perform tests.

I don't understand this.  What is a "system page"?  If I knew that, I
migth perhaps understand why we cannot lock such a page.

> + * Therefore, as we might return false positives in the case a balloon page
> + * is just released under us, the page->mapping->flags need to be retested
> + * with the proper page lock held, on the functions that will cope with the
> + * balloon page later.
> + */
> +static inline bool movable_balloon_page(struct page *page)
> +{
> +	/*
> +	 * Before dereferencing and testing mapping->flags, lets make sure
> +	 * this is not a page that uses ->mapping in a different way
> +	 */
> +	if (!PageSlab(page) && !PageSwapCache(page) && !PageAnon(page) &&
> +	    !page_mapped(page))
> +		return __is_movable_balloon_page(page);
> +
> +	return false;
> +}
> +
> +/*
> + * __page_balloon_device - get the balloon device that owns the given page.
> + *
> + * This shall only be used at driver callbacks under proper page lock,
> + * to get access to the balloon device which @page belongs.
> + */
> +static inline void *__page_balloon_device(struct page *page)
> +{
> +	struct address_space *mapping = page->mapping;
> +	if (mapping)
> +		mapping = mapping->assoc_mapping;
> +
> +	return mapping;
> +}

So you've repurposed address_space.assoc_mapping in new and unexpected
ways.

I don't immediately see a problem with doing this, but we should do it
properly.  Something like:

- rename address_space.assoc_mapping to private_data
- it has type void*
- document its ownership rules
- convert fs/buffer.c

all done as a standalone preparatory patch.

Also, your usage of ->private_data should minimise its use of void* -
use more specific types wherever possible.  So this function should
return a "struct virtio_balloon *".

It is unobvious why this interface function is prefixed with __.

> +/*
> + * DEFINE_BALLOON_MAPPING_AOPS - declare and instantiate a callback descriptor
> + *				 to be used as balloon page->mapping->a_ops.
> + *
> + * @label     : declaration identifier (var name)
> + * @isolatepg : callback symbol name for performing the page isolation step
> + * @migratepg : callback symbol name for performing the page migration step
> + * @putbackpg : callback symbol name for performing the page putback step
> + *
> + * address_space_operations utilized methods for ballooned pages:
> + *   .migratepage    - used to perform balloon's page migration (as is)
> + *   .invalidatepage - used to isolate a page from balloon's page list
> + *   .freepage       - used to reinsert an isolated page to balloon's page list
> + */
> +#define DEFINE_BALLOON_MAPPING_AOPS(label, isolatepg, migratepg, putbackpg) \
> +	const struct address_space_operations (label) = {		    \
> +		.migratepage    = (migratepg),				    \
> +		.invalidatepage = (isolatepg),				    \
> +		.freepage       = (putbackpg),				    \
> +	}

erp.  Can we avoid doing this?  afaict it would be pretty simple to
avoid instantiating virtio_balloon_aops at all if
CONFIG_BALLOON_COMPACTION=n?

> +#else
> +#define assign_balloon_mapping(p, m)	do { } while (0)
> +#define clear_balloon_mapping(p)	do { } while (0)
> +#define free_balloon_mapping(m)		do { } while (0)
> +#define count_balloon_event(e)		do { } while (0)

Written in C with proper types if possible, please.

> +#define DEFINE_BALLOON_MAPPING_AOPS(label, isolatepg, migratepg, putbackpg) \
> +	const struct {} (label) = {}
> +
> +static inline bool movable_balloon_page(struct page *page) { return false; }
> +static inline bool isolate_balloon_page(struct page *page) { return false; }
> +static inline void putback_balloon_page(struct page *page) { return; }
> +
> +static inline int migrate_balloon_page(struct page *newpage,
> +				struct page *page, enum migrate_mode mode)
> +{
> +	return 0;
> +}
> +
>
> ...
>
> @@ -53,6 +54,23 @@ static inline int mapping_unevictable(struct address_space *mapping)
>  	return !!mapping;
>  }
>  
> +static inline void mapping_set_balloon(struct address_space *mapping)
> +{
> +	set_bit(AS_BALLOON_MAP, &mapping->flags);
> +}
> +
> +static inline void mapping_clear_balloon(struct address_space *mapping)
> +{
> +	clear_bit(AS_BALLOON_MAP, &mapping->flags);
> +}
> +
> +static inline int mapping_balloon(struct address_space *mapping)
> +{
> +	if (mapping)
> +		return test_bit(AS_BALLOON_MAP, &mapping->flags);
> +	return !!mapping;

Why not "return 0"?

Or

	return mapping && test_bit(AS_BALLOON_MAP, &mapping->flags);

> +}
> +
>
> ...
>
> +struct address_space *alloc_balloon_mapping(void *balloon_device,
> +				const struct address_space_operations *a_ops)
> +{
> +	struct address_space *mapping;
> +
> +	mapping = kmalloc(sizeof(*mapping), GFP_KERNEL);
> +	if (!mapping)
> +		return ERR_PTR(-ENOMEM);
> +
> +	/*
> +	 * Give a clean 'zeroed' status to all elements of this special
> +	 * balloon page->mapping struct address_space instance.
> +	 */
> +	address_space_init_once(mapping);
> +
> +	/*
> +	 * Set mapping->flags appropriately, to allow balloon ->mapping
> +	 * identification, as well as give a proper hint to the balloon
> +	 * driver on what GFP allocation mask shall be used.
> +	 */
> +	mapping_set_balloon(mapping);
> +	mapping_set_gfp_mask(mapping, balloon_mapping_gfp_mask());
> +
> +	/* balloon's page->mapping->a_ops callback descriptor */
> +	mapping->a_ops = a_ops;
> +
> +	/*
> +	 * balloon special page->mapping overloads ->assoc_mapping
> +	 * to held a reference back to the balloon device wich 'owns'
> +	 * a given page. This is the way we can cope with multiple
> +	 * balloon devices without losing reference of several
> +	 * ballooned pagesets.

I don't really understand the final part of this comment.  Can you
expand more fully on the problem which this code is solving?

> +	 */
> +	mapping->assoc_mapping = balloon_device;
> +
> +	return mapping;
> +}
> +EXPORT_SYMBOL_GPL(alloc_balloon_mapping);

balloon_mapping_alloc() :)

> +static inline void __isolate_balloon_page(struct page *page)
> +{
> +	page->mapping->a_ops->invalidatepage(page, 0);
> +}
> +
>
> ...
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2012-09-17 22:15 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-09-17 16:38 [PATCH v10 0/5] make balloon pages movable by compaction Rafael Aquini
2012-09-17 16:38 ` [PATCH v10 1/5] mm: introduce a common interface for balloon pages mobility Rafael Aquini
2012-09-17 22:15   ` Andrew Morton [this message]
2012-09-18 16:24     ` Rafael Aquini
2012-09-18 22:09       ` Andrew Morton
2012-09-25  1:05       ` Michael S. Tsirkin
2012-09-25 14:00         ` Rafael Aquini
2012-09-24 12:44   ` Peter Zijlstra
2012-09-17 16:38 ` [PATCH v10 2/5] mm: introduce compaction and migration for ballooned pages Rafael Aquini
2012-09-17 16:38 ` [PATCH v10 3/5] virtio_balloon: introduce migration primitives to balloon pages Rafael Aquini
2012-09-17 22:15   ` Andrew Morton
2012-09-18 14:07     ` Rafael Aquini
2012-09-25  0:40   ` Michael S. Tsirkin
2012-09-25 18:07     ` Rafael Aquini
2012-09-17 16:38 ` [PATCH v10 4/5] mm: introduce putback_movable_pages() Rafael Aquini
2012-09-17 16:38 ` [PATCH v10 5/5] mm: add vm event counters for balloon pages compaction Rafael Aquini
2012-09-17 22:15 ` [PATCH v10 0/5] make balloon pages movable by compaction Andrew Morton
2012-09-17 22:45   ` Rik van Riel
2012-09-18  0:45   ` Rusty Russell
2012-09-25  1:17   ` Michael S. Tsirkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120917151543.fd523040.akpm@linux-foundation.org \
    --to=akpm@linux-foundation.org \
    --cc=andi@firstfloor.org \
    --cc=aquini@redhat.com \
    --cc=konrad.wilk@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mel@csn.ul.ie \
    --cc=minchan@kernel.org \
    --cc=mst@redhat.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=riel@redhat.com \
    --cc=rusty@rustcorp.com.au \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).