Re: [PATCH 01/15] mm: cleancache: lazy initialization to allow tmem backends to build/run as modules


From: Ric Mason <ric.masonn@gmail.com>
To: Konrad Rzeszutek Wilk <konrad@kernel.org>
Cc: dan.magenheimer@oracle.com, konrad.wilk@oracle.com,
	sjenning@linux.vnet.ibm.com, gregkh@linuxfoundation.org,
	akpm@linux-foundation.org, ngupta@vflare.org,
	rcj@linux.vnet.ibm.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, devel@driverdev.osuosl.org,
	Stefan Hengelein <ilendir@googlemail.com>,
	Florian Schmaus <fschmaus@gmail.com>,
	Andor Daam <andor.daam@googlemail.com>
Subject: Re: [PATCH 01/15] mm: cleancache: lazy initialization to allow tmem backends to build/run as modules
Date: Sun, 03 Feb 2013 02:06:30 -0600
Message-ID: <1359878790.1328.9.camel@kernel.cn.ibm.com>
In-Reply-To: <1359750184-23408-2-git-send-email-konrad.wilk@oracle.com>

Hi Konrad,
On Fri, 2013-02-01 at 15:22 -0500, Konrad Rzeszutek Wilk wrote:
> From: Dan Magenheimer <dan.magenheimer@oracle.com>
> 
> With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
> built/loaded as modules rather than built-in and enabled by a boot parameter,

Which boot parameter? I can't find it in
Documentation/kernel-parameters.txt

> this patch provides "lazy initialization", allowing backends to register to
> cleancache even after filesystems were mounted. Calls to init_fs and
> init_shared_fs are remembered as fake poolids but no real tmem_pools created.
> On backend registration the fake poolids are mapped to real poolids and
> respective tmem_pools.
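
For anyone following along, here is a minimal userspace sketch of the lifecycle described above, as I read it: a mount before any backend exists only claims a slot and hands out a fake poolid, and registration later replays the remembered mounts. The constants mirror the patch; struct lazy_pools, the lazy_* helpers and the next_real_id counter (a stand-in for the backend's init_fs) are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Values mirror the patch; everything else is a hypothetical model. */
#define MAX_INITIALIZABLE_FS 32
#define FAKE_FS_POOLID_OFFSET 1000
#define FS_NO_BACKEND (-1)
#define FS_UNKNOWN (-2)

static_assert(FS_NO_BACKEND < 0 && FS_UNKNOWN < 0,
	      "slot markers must not collide with real pool ids");

struct lazy_pools {
	int map[MAX_INITIALIZABLE_FS];	/* slot -> real poolid or FS_* marker */
	bool backend_registered;
	int next_real_id;		/* stand-in for the backend's init_fs() */
};

static void lazy_pools_init(struct lazy_pools *lp)
{
	for (int i = 0; i < MAX_INITIALIZABLE_FS; i++)
		lp->map[i] = FS_UNKNOWN;
	lp->backend_registered = false;
	lp->next_real_id = 0;
}

/* Models mount: claim a free slot and return its fake id, or -1 if full. */
static int lazy_mount(struct lazy_pools *lp)
{
	for (int i = 0; i < MAX_INITIALIZABLE_FS; i++) {
		if (lp->map[i] != FS_UNKNOWN)
			continue;
		lp->map[i] = lp->backend_registered ?
			lp->next_real_id++ : FS_NO_BACKEND;
		return i + FAKE_FS_POOLID_OFFSET;
	}
	return -1;
}

/* Models backend registration: replay mounts that were only remembered. */
static void lazy_register_backend(struct lazy_pools *lp)
{
	lp->backend_registered = true;
	for (int i = 0; i < MAX_INITIALIZABLE_FS; i++)
		if (lp->map[i] == FS_NO_BACKEND)
			lp->map[i] = lp->next_real_id++;
}

/* Translate a fake id to the real pool id, or a negative marker. */
static int lazy_real_poolid(const struct lazy_pools *lp, int fake_id)
{
	int slot = fake_id - FAKE_FS_POOLID_OFFSET;

	if (slot < 0 || slot >= MAX_INITIALIZABLE_FS)
		return FS_NO_BACKEND;
	return lp->map[slot];
}
```

The point of the indirection is that the value stored in the super block never changes after mount; only the map entry behind it does.
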
> 
> Signed-off-by: Stefan Hengelein <ilendir@googlemail.com>
> Signed-off-by: Florian Schmaus <fschmaus@gmail.com>
> Signed-off-by: Andor Daam <andor.daam@googlemail.com>
> Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
> [v1: Minor fixes: used #define for some values and bools]
> [v2: Removed CLEANCACHE_HAS_LAZY_INIT]
> [v3: Added more comments, added a lock for [shared_|]fs_poolid_map]
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
>  mm/cleancache.c | 240 +++++++++++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 219 insertions(+), 21 deletions(-)
> 
> diff --git a/mm/cleancache.c b/mm/cleancache.c
> index 32e6f41..e4dc314 100644
> --- a/mm/cleancache.c
> +++ b/mm/cleancache.c
> @@ -45,15 +45,99 @@ static u64 cleancache_puts;
>  static u64 cleancache_invalidates;
>  
>  /*
> - * register operations for cleancache, returning previous thus allowing
> - * detection of multiple backends and possible nesting
> + * When no backend is registered all calls to init_fs and init_shared_fs
> + * are registered and fake poolids (FAKE_FS_POOLID_OFFSET or
> + * FAKE_SHARED_FS_POOLID_OFFSET, plus offset in the respective array
> + * [shared_|]fs_poolid_map) are given to the respective super block
> + * (sb->cleancache_poolid) and no tmem_pools are created. When a backend
> + * registers with cleancache the previous calls to init_fs and init_shared_fs
> + * are executed to create tmem_pools and set the respective poolids. While no
> + * backend is registered all "puts", "gets" and "flushes" are ignored or failed.
> + */
> +#define MAX_INITIALIZABLE_FS 32
> +#define FAKE_FS_POOLID_OFFSET 1000
> +#define FAKE_SHARED_FS_POOLID_OFFSET 2000
> +
> +#define FS_NO_BACKEND (-1)
> +#define FS_UNKNOWN (-2)
> +static int fs_poolid_map[MAX_INITIALIZABLE_FS];
> +static int shared_fs_poolid_map[MAX_INITIALIZABLE_FS];
> +static char *uuids[MAX_INITIALIZABLE_FS];
> +/*
> + * Mutex for the [shared_|]fs_poolid_map to guard against multiple threads
> + * invoking umount (and ending in __cleancache_invalidate_fs) and also multiple
> + * threads calling mount (and ending up in __cleancache_init_[shared|]fs).
> + */
> +static DEFINE_MUTEX(poolid_mutex);
> +/*
> + * When set to false (the default), all calls to the cleancache functions
> + * except __cleancache_invalidate_fs and __cleancache_init_[shared|]fs are
> + * guarded by an "if (!backend_registered) return". This means multiple threads
> + * (from different filesystems) will be checking backend_registered. Using a
> + * plain bool instead of an atomic_t or a bool guarded by a spinlock is OK: we
> + * can tolerate a delay between the backends being initialized (and
> + * backend_registered being set to true) and the filesystems actually starting
> + * to call the backends. The inverse (when unloading) is obviously not good,
> + * but this shim does not do that (yet).
> + */
> +static bool backend_registered __read_mostly;
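
The comment above argues that a plain bool is sufficient here. For comparison only, if stronger ordering between "ops are filled in" and "flag is visible" were ever wanted, the flag could be published with release/acquire semantics. Below is a minimal userspace sketch using C11 <stdatomic.h>; all demo_* names are hypothetical, and in-kernel code would use the kernel's own barrier primitives rather than this header.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cleancache_ops. */
struct demo_ops {
	int (*init_fs)(size_t pagesize);
};

static struct demo_ops demo_ops;
static atomic_bool demo_backend_registered;	/* zero-initialized: false */

/* Toy backend used only for illustration; always hands out pool id 0. */
static int demo_toy_init_fs(size_t pagesize)
{
	(void)pagesize;
	return 0;
}

/* Writer: fill in the ops first, then publish the flag with release order. */
static void demo_register_ops(const struct demo_ops *ops)
{
	assert(ops != NULL && ops->init_fs != NULL);
	demo_ops = *ops;
	atomic_store_explicit(&demo_backend_registered, true,
			      memory_order_release);
}

/* Reader: an acquire load that sees true also sees the ops stored before. */
static int demo_init_fs(size_t pagesize)
{
	if (!atomic_load_explicit(&demo_backend_registered,
				  memory_order_acquire))
		return -1;	/* no backend yet */
	return demo_ops.init_fs(pagesize);
}
```

This only covers the registration direction; it does nothing for the unload case that the comment already flags as unsolved.
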
> +
> +/*
> + * The backends and filesystems work all asynchronously. This is b/c the

What's the meaning of b/c?

> + * backends can be built as modules.
> + * The usual sequence of events is:
> + * 	a). mount /	-> __cleancache_init_fs is called. We set the
> + * 		[shared_|]fs_poolid_map and uuids for it.
> + *
> + * 	b). user does I/Os -> we call the rest of __cleancache_* functions
> + * 		which return immediately as backend_registered is false.
> + *
> + * 	c). modprobe zcache -> cleancache_register_ops. We init the backend
> + * 		and set backend_registered to true, and for any fs_poolid_map
> + * 		(which is set by __cleancache_init_fs) we initialize the poolid.
> + *
> + * 	d). user does I/Os -> now that backend_registered is true all the
> + * 		__cleancache_* functions can call the backend. They all check
> + * 		that fs_poolid_map is valid and if so invoke the backend.
> + *
> + * 	e). umount /	-> __cleancache_invalidate_fs, the fs_poolid_map is
> + * 		reset (which is the second check in the __cleancache_* ops
> + * 		to call the backend).
> + *
> + * The sequence of events could also be c), followed by a), d) and e). The
> + * c) would not happen anymore. There is also the chance of c), and one thread
> + * doing a) + d), and another doing e). For that case we depend on the
> + * filesystem calling __cleancache_invalidate_fs in the proper sequence (so
> + * that it handles all I/Os before it invalidates the fs, which is the last
> + * part of the unmounting process).
> + *
> + * Note: The astute reader will notice that there is no "rmmod zcache" case.
> + * This is b/c the functionality for that is not yet implemented and when
> + * done, will require some extra locking not yet devised.
> + */
> +
> +/*
> + * Register operations for cleancache, returning previous thus allowing
> + * detection of multiple backends and possible nesting.
>   */
>  struct cleancache_ops cleancache_register_ops(struct cleancache_ops *ops)
>  {
>  	struct cleancache_ops old = cleancache_ops;
> +	int i;
>  
> +	mutex_lock(&poolid_mutex);
>  	cleancache_ops = *ops;
> -	cleancache_enabled = 1;
> +
> +	backend_registered = true;
> +	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
> +		if (fs_poolid_map[i] == FS_NO_BACKEND)
> +			fs_poolid_map[i] = (*cleancache_ops.init_fs)(PAGE_SIZE);
> +		if (shared_fs_poolid_map[i] == FS_NO_BACKEND)
> +			shared_fs_poolid_map[i] = (*cleancache_ops.init_shared_fs)
> +					(uuids[i], PAGE_SIZE);
> +	}
> +out:
> +	mutex_unlock(&poolid_mutex);
>  	return old;
>  }
>  EXPORT_SYMBOL(cleancache_register_ops);
> @@ -61,15 +145,42 @@ EXPORT_SYMBOL(cleancache_register_ops);
>  /* Called by a cleancache-enabled filesystem at time of mount */
>  void __cleancache_init_fs(struct super_block *sb)
>  {
> -	sb->cleancache_poolid = (*cleancache_ops.init_fs)(PAGE_SIZE);
> +	int i;
> +
> +	mutex_lock(&poolid_mutex);
> +	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
> +		if (fs_poolid_map[i] == FS_UNKNOWN) {
> +			sb->cleancache_poolid = i + FAKE_FS_POOLID_OFFSET;
> +			if (backend_registered)
> +				fs_poolid_map[i] = (*cleancache_ops.init_fs)(PAGE_SIZE);
> +			else
> +				fs_poolid_map[i] = FS_NO_BACKEND;
> +			break;
> +		}
> +	}
> +	mutex_unlock(&poolid_mutex);
>  }
>  EXPORT_SYMBOL(__cleancache_init_fs);
>  
>  /* Called by a cleancache-enabled clustered filesystem at time of mount */
>  void __cleancache_init_shared_fs(char *uuid, struct super_block *sb)
>  {
> -	sb->cleancache_poolid =
> -		(*cleancache_ops.init_shared_fs)(uuid, PAGE_SIZE);
> +	int i;
> +
> +	mutex_lock(&poolid_mutex);
> +	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
> +		if (shared_fs_poolid_map[i] == FS_UNKNOWN) {
> +			sb->cleancache_poolid = i + FAKE_SHARED_FS_POOLID_OFFSET;
> +			uuids[i] = uuid;
> +			if (backend_registered)
> +				shared_fs_poolid_map[i] = (*cleancache_ops.init_shared_fs)
> +						(uuid, PAGE_SIZE);
> +			else
> +				shared_fs_poolid_map[i] = FS_NO_BACKEND;
> +			break;
> +		}
> +	}
> +	mutex_unlock(&poolid_mutex);
>  }
>  EXPORT_SYMBOL(__cleancache_init_shared_fs);
>  
> @@ -99,27 +210,53 @@ static int cleancache_get_key(struct inode *inode,
>  }
>  
>  /*
> + * Returns a pool_id that is associated with a given fake poolid.
> + */
> +static int get_poolid_from_fake(int fake_pool_id)
> +{
> +	if (fake_pool_id >= FAKE_SHARED_FS_POOLID_OFFSET)
> +		return shared_fs_poolid_map[fake_pool_id -
> +			FAKE_SHARED_FS_POOLID_OFFSET];
> +	else if (fake_pool_id >= FAKE_FS_POOLID_OFFSET)
> +		return fs_poolid_map[fake_pool_id - FAKE_FS_POOLID_OFFSET];
> +	return FS_NO_BACKEND;
> +}
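
Note that get_poolid_from_fake() indexes the maps without checking the upper bound, so it relies on sb->cleancache_poolid only ever holding values handed out by the init functions. A defensive variant might look like the sketch below. The name checked_poolid_from_fake and the static_assert are my own additions for illustration, not something in the patch; the constants and array names are copied from it.

```c
#include <assert.h>

#define MAX_INITIALIZABLE_FS 32
#define FAKE_FS_POOLID_OFFSET 1000
#define FAKE_SHARED_FS_POOLID_OFFSET 2000
#define FS_NO_BACKEND (-1)

/* The two fake id ranges must not overlap for the decoding to be unique. */
static_assert(FAKE_FS_POOLID_OFFSET + MAX_INITIALIZABLE_FS <=
	      FAKE_SHARED_FS_POOLID_OFFSET,
	      "fake poolid ranges overlap");

static int fs_poolid_map[MAX_INITIALIZABLE_FS];
static int shared_fs_poolid_map[MAX_INITIALIZABLE_FS];

/* Like get_poolid_from_fake(), but rejects ids outside either range. */
static int checked_poolid_from_fake(int fake_pool_id)
{
	int index;

	if (fake_pool_id >= FAKE_SHARED_FS_POOLID_OFFSET) {
		index = fake_pool_id - FAKE_SHARED_FS_POOLID_OFFSET;
		if (index < MAX_INITIALIZABLE_FS)
			return shared_fs_poolid_map[index];
	} else if (fake_pool_id >= FAKE_FS_POOLID_OFFSET) {
		index = fake_pool_id - FAKE_FS_POOLID_OFFSET;
		if (index < MAX_INITIALIZABLE_FS)
			return fs_poolid_map[index];
	}
	return FS_NO_BACKEND;
}
```

Whether the extra comparison is worth it on these hot paths is a judgment call; the callers already treat a negative result as "skip the backend".
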
> +
> +/*
>   * "Get" data from cleancache associated with the poolid/inode/index
>   * that were specified when the data was put to cleanache and, if
>   * successful, use it to fill the specified page with data and return 0.
>   * The pageframe is unchanged and returns -1 if the get fails.
>   * Page must be locked by caller.
> + *
> + * The function has two checks before any action is taken - whether
> + * a backend is registered and whether the sb->cleancache_poolid
> + * is correct.
>   */
>  int __cleancache_get_page(struct page *page)
>  {
>  	int ret = -1;
>  	int pool_id;
> +	int fake_pool_id;
>  	struct cleancache_filekey key = { .u.key = { 0 } };
>  
> +	if (!backend_registered) {
> +		cleancache_failed_gets++;
> +		goto out;
> +	}
> +
>  	VM_BUG_ON(!PageLocked(page));
> -	pool_id = page->mapping->host->i_sb->cleancache_poolid;
> -	if (pool_id < 0)
> +	fake_pool_id = page->mapping->host->i_sb->cleancache_poolid;
> +	if (fake_pool_id < 0)
>  		goto out;
> +	pool_id = get_poolid_from_fake(fake_pool_id);
>  
>  	if (cleancache_get_key(page->mapping->host, &key) < 0)
>  		goto out;
>  
> -	ret = (*cleancache_ops.get_page)(pool_id, key, page->index, page);
> +	if (pool_id >= 0)
> +		ret = (*cleancache_ops.get_page)(pool_id,
> +				key, page->index, page);
>  	if (ret == 0)
>  		cleancache_succ_gets++;
>  	else
> @@ -134,16 +271,31 @@ EXPORT_SYMBOL(__cleancache_get_page);
>   * (previously-obtained per-filesystem) poolid and the page's,
>   * inode and page index.  Page must be locked.  Note that a put_page
>   * always "succeeds", though a subsequent get_page may succeed or fail.
> + *
> + * The function has two checks before any action is taken - whether
> + * a backend is registered and whether the sb->cleancache_poolid
> + * is correct.
>   */
>  void __cleancache_put_page(struct page *page)
>  {
>  	int pool_id;
> +	int fake_pool_id;
>  	struct cleancache_filekey key = { .u.key = { 0 } };
>  
> +	if (!backend_registered) {
> +		cleancache_puts++;
> +		return;
> +	}
> +
>  	VM_BUG_ON(!PageLocked(page));
> -	pool_id = page->mapping->host->i_sb->cleancache_poolid;
> +	fake_pool_id = page->mapping->host->i_sb->cleancache_poolid;
> +	if (fake_pool_id < 0)
> +		return;
> +
> +	pool_id = get_poolid_from_fake(fake_pool_id);
> +
>  	if (pool_id >= 0 &&
> -	      cleancache_get_key(page->mapping->host, &key) >= 0) {
> +		cleancache_get_key(page->mapping->host, &key) >= 0) {
>  		(*cleancache_ops.put_page)(pool_id, key, page->index, page);
>  		cleancache_puts++;
>  	}
> @@ -153,19 +305,31 @@ EXPORT_SYMBOL(__cleancache_put_page);
>  /*
>   * Invalidate any data from cleancache associated with the poolid and the
>   * page's inode and page index so that a subsequent "get" will fail.
> + *
> + * The function has two checks before any action is taken - whether
> + * a backend is registered and whether the sb->cleancache_poolid
> + * is correct.
>   */
>  void __cleancache_invalidate_page(struct address_space *mapping,
>  					struct page *page)
>  {
>  	/* careful... page->mapping is NULL sometimes when this is called */
> -	int pool_id = mapping->host->i_sb->cleancache_poolid;
> +	int pool_id;
> +	int fake_pool_id = mapping->host->i_sb->cleancache_poolid;
>  	struct cleancache_filekey key = { .u.key = { 0 } };
>  
> -	if (pool_id >= 0) {
> +	if (!backend_registered)
> +		return;
> +
> +	if (fake_pool_id >= 0) {
> +		pool_id = get_poolid_from_fake(fake_pool_id);
> +		if (pool_id < 0)
> +			return;
> +
>  		VM_BUG_ON(!PageLocked(page));
>  		if (cleancache_get_key(mapping->host, &key) >= 0) {
>  			(*cleancache_ops.invalidate_page)(pool_id,
> -							  key, page->index);
> +					key, page->index);
>  			cleancache_invalidates++;
>  		}
>  	}
> @@ -176,12 +340,25 @@ EXPORT_SYMBOL(__cleancache_invalidate_page);
>   * Invalidate all data from cleancache associated with the poolid and the
>   * mappings's inode so that all subsequent gets to this poolid/inode
>   * will fail.
> + *
> + * The function has two checks before any action is taken - whether
> + * a backend is registered and whether the sb->cleancache_poolid
> + * is correct.
>   */
>  void __cleancache_invalidate_inode(struct address_space *mapping)
>  {
> -	int pool_id = mapping->host->i_sb->cleancache_poolid;
> +	int pool_id;
> +	int fake_pool_id = mapping->host->i_sb->cleancache_poolid;
>  	struct cleancache_filekey key = { .u.key = { 0 } };
>  
> +	if (!backend_registered)
> +		return;
> +
> +	if (fake_pool_id < 0)
> +		return;
> +
> +	pool_id = get_poolid_from_fake(fake_pool_id);
> +
>  	if (pool_id >= 0 && cleancache_get_key(mapping->host, &key) >= 0)
>  		(*cleancache_ops.invalidate_inode)(pool_id, key);
>  }
> @@ -189,21 +366,37 @@ EXPORT_SYMBOL(__cleancache_invalidate_inode);
>  
>  /*
>   * Called by any cleancache-enabled filesystem at time of unmount;
> - * note that pool_id is surrendered and may be reutrned by a subsequent
> - * cleancache_init_fs or cleancache_init_shared_fs
> + * note that pool_id is surrendered and may be returned by a subsequent
> + * cleancache_init_fs or cleancache_init_shared_fs.
>   */
>  void __cleancache_invalidate_fs(struct super_block *sb)
>  {
> -	if (sb->cleancache_poolid >= 0) {
> -		int old_poolid = sb->cleancache_poolid;
> -		sb->cleancache_poolid = -1;
> -		(*cleancache_ops.invalidate_fs)(old_poolid);
> +	int index;
> +	int fake_pool_id = sb->cleancache_poolid;
> +	int old_poolid = fake_pool_id;
> +
> +	mutex_lock(&poolid_mutex);
> +	if (fake_pool_id >= FAKE_SHARED_FS_POOLID_OFFSET) {
> +		index = fake_pool_id - FAKE_SHARED_FS_POOLID_OFFSET;
> +		old_poolid = shared_fs_poolid_map[index];
> +		shared_fs_poolid_map[index] = FS_UNKNOWN;
> +		uuids[index] = NULL;
> +	} else if (fake_pool_id >= FAKE_FS_POOLID_OFFSET) {
> +		index = fake_pool_id - FAKE_FS_POOLID_OFFSET;
> +		old_poolid = fs_poolid_map[index];
> +		fs_poolid_map[index] = FS_UNKNOWN;
>  	}
> +	sb->cleancache_poolid = -1;
> +	if (backend_registered)
> +		(*cleancache_ops.invalidate_fs)(old_poolid);
> +	mutex_unlock(&poolid_mutex);
>  }
>  EXPORT_SYMBOL(__cleancache_invalidate_fs);
>  
>  static int __init init_cleancache(void)
>  {
> +	int i;
> +
>  #ifdef CONFIG_DEBUG_FS
>  	struct dentry *root = debugfs_create_dir("cleancache", NULL);
>  	if (root == NULL)
> @@ -215,6 +408,11 @@ static int __init init_cleancache(void)
>  	debugfs_create_u64("invalidates", S_IRUGO,
>  				root, &cleancache_invalidates);
>  #endif
> +	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
> +		fs_poolid_map[i] = FS_UNKNOWN;
> +		shared_fs_poolid_map[i] = FS_UNKNOWN;
> +	}
> +	cleancache_enabled = 1;
>  	return 0;
>  }
>  module_init(init_cleancache)




