linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: zwu.kernel@gmail.com
Cc: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
	linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org,
	linuxram@linux.vnet.ibm.com, viro@zeniv.linux.org.uk,
	dave@jikos.cz, tytso@mit.edu, cmm@us.ibm.com,
	Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Subject: Re: [RFC v3 09/13] vfs: add one wq to update map info periodically
Date: Tue, 16 Oct 2012 11:27:44 +1100	[thread overview]
Message-ID: <20121016002744.GC2864@dastard> (raw)
In-Reply-To: <1349863655-29320-10-git-send-email-zwu.kernel@gmail.com>

On Wed, Oct 10, 2012 at 06:07:31PM +0800, zwu.kernel@gmail.com wrote:
> From: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
> 
>   Add a per-superblock workqueue and a work_struct
>  to run periodic work to update map info on each superblock.
> 
> Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
> ---
>  fs/hot_tracking.c            |   94 ++++++++++++++++++++++++++++++++++++++++++
>  fs/hot_tracking.h            |    3 +
>  include/linux/hot_tracking.h |    2 +
>  3 files changed, 99 insertions(+), 0 deletions(-)
> 
> diff --git a/fs/hot_tracking.c b/fs/hot_tracking.c
> index a8dc599..f333c47 100644
> --- a/fs/hot_tracking.c
> +++ b/fs/hot_tracking.c
> @@ -15,6 +15,8 @@
>  #include <linux/module.h>
>  #include <linux/spinlock.h>
>  #include <linux/hardirq.h>
> +#include <linux/kthread.h>
> +#include <linux/freezer.h>
>  #include <linux/fs.h>
>  #include <linux/blkdev.h>
>  #include <linux/types.h>
> @@ -623,6 +625,88 @@ static void hot_map_array_exit(struct hot_info *root)
>  }
>  
>  /*
> + * Update temperatures for each hot inode item and
> + * hot range item for aging purposes
> + */
> +static void hot_temperature_update_work(struct work_struct *work)
> +{
> +	struct hot_update_work *hot_work =
> +			container_of(work, struct hot_update_work, work);
> +	struct hot_info *root = hot_work->hot_info;
> +	struct hot_inode_item *hi_nodes[8];
> +	unsigned long delay = HZ * HEAT_UPDATE_DELAY;
> +	u64 ino = 0;
> +	int i, n;
> +
> +	do {
> +		while (1) {
> +			spin_lock(&root->lock);
> +			n = radix_tree_gang_lookup(&root->hot_inode_tree,
> +					   (void **)hi_nodes, ino,
> +					   ARRAY_SIZE(hi_nodes));
> +			if (!n) {
> +				spin_unlock(&root->lock);
> +				break;
> +			}
> +
> +			ino = hi_nodes[n - 1]->i_ino + 1;
> +			for (i = 0; i < n; i++) {
> +				kref_get(&hi_nodes[i]->hot_inode.refs);
> +				hot_map_array_update(
> +					&hi_nodes[i]->hot_inode.hot_freq_data, root);
> +				hot_range_update(hi_nodes[i], root);
> +				hot_inode_item_put(hi_nodes[i]);
> +			}
> +			spin_unlock(&root->lock);

This is a lot of work to do under a spin lock. Perhaps you should
get a reference on all the nodes, then drop the root->lock and then
update all the nodes in a separate loop.

> +		}
> +
> +		if (unlikely(freezing(current))) {
> +			__refrigerator(true);
> +		} else {
> +			set_current_state(TASK_INTERRUPTIBLE);
> +			if (!kthread_should_stop()) {
> +				schedule_timeout(delay);
> +			}
> +			__set_current_state(TASK_RUNNING);
> +		}
> +	} while (!kthread_should_stop());

I don't think you understand workqueues fully. A work queue worker
function is not something that executes endlessly. It is a
"one-shot" function that does the work once, not an endless loop
that has to delay it's execution for periodic work.

If you need periodic work, then you should use a struct delayed_work
and queue the next work iteration to be run a later time. See, for
example, xfs_syncd_worker() and xfs_syncd_queue_sync() and how that
reschedules itself for periodic work. It also means you don't have
to handle kthread freezing, as the WQ infrastructure takes care of
that for you.

This is why unmount is hanging for me - this work never completes,
so flush_workqueue() will never return.

> +}
> +
> +static int hot_wq_init(struct hot_info *root)
> +{
> +	struct hot_update_work *hot_work;
> +	int ret = 0;
> +
> +	root->update_wq = alloc_workqueue(
> +		"hot_temperature_update", WQ_MEM_RECLAIM | WQ_UNBOUND, 1);
> +	if (!root->update_wq) {
> +		printk(KERN_ERR "%s: failed to create "
> +			"temperature update workqueue\n",
> +			__func__);
> +		return 1;
> +	}
> +
> +	hot_work = kmalloc(sizeof(*hot_work), GFP_NOFS);
> +	if (hot_work) {
> +		hot_work->hot_info = root;
> +		INIT_WORK(&hot_work->work, hot_temperature_update_work);
> +		queue_work(root->update_wq, &hot_work->work);
> +	} else {
> +		printk(KERN_ERR "%s: failed to create update work\n",
> +				__func__);
> +		ret = 1;
> +	}

I don't understand why you need a separate "hot_work" structure.
just embed a struct delayed_work in the struct hot_info and use
container_of() to get the struct hot_info from the work structure.
As such, there's no need for a separate function just for this
initialisation - just put it in line.

> +
> +	return ret;
> +}
> +
> +static void hot_wq_exit(struct workqueue_struct *wq)
> +{
> +	flush_workqueue(wq);

flush_workqueue_sync().

> +	destroy_workqueue(wq);
> +}

And there's not need for separate function for this - put it in
line.

FWIW, it also leaks the hot_work structure, but you're going to
remove that anyway. ;)

> diff --git a/fs/hot_tracking.h b/fs/hot_tracking.h
> index d19e64a..7a79a6d 100644
> --- a/fs/hot_tracking.h
> +++ b/fs/hot_tracking.h
> @@ -36,6 +36,9 @@
>   */
>  #define TIME_TO_KICK 400
>  
> +/* set how often to update temperatures (seconds) */
> +#define HEAT_UPDATE_DELAY 400

FWIW, 400 seconds is an unusual time period. It's expected that
periodic work might take place at intervals of 5 minutes, 10
minutes, etc, not 6m40s. It's much easier to predict and understand
behaviour if it's at a interval of whole units like minutes,
especially when looking at timestamped event traces. Hence 300s (5
minutes) makes a lot more sense as a period for updates...

>  /*
>   * The following comments explain what exactly comprises a unit of heat.
>   *
> diff --git a/include/linux/hot_tracking.h b/include/linux/hot_tracking.h
> index 7114179..b37e0f8 100644
> --- a/include/linux/hot_tracking.h
> +++ b/include/linux/hot_tracking.h
> @@ -84,6 +84,8 @@ struct hot_info {
>  
>  	/* map of range temperature */
>  	struct hot_map_head heat_range_map[HEAT_MAP_SIZE];
> +
> +	struct workqueue_struct *update_wq;

Add the struct delayed_work here, too.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2012-10-16  0:27 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-10-10 10:07 [RFC v3 00/13] vfs: hot data tracking zwu.kernel
2012-10-10 10:07 ` [RFC v3 01/13] btrfs: add one new mount option '-o hot_track' zwu.kernel
     [not found]   ` <5075632c.03cc440a.1b33.7805SMTPIN_ADDED@mx.google.com>
2012-10-10 12:21     ` Zhi Yong Wu
2012-10-10 13:11       ` Lukáš Czerner
2012-10-10 13:16         ` Zhi Yong Wu
2012-10-10 16:28   ` David Sterba
2012-10-11 13:41     ` Zhi Yong Wu
2012-10-11 14:35     ` Zhi Yong Wu
2012-10-11 14:41       ` David Sterba
2012-10-11 14:46         ` Zhi Yong Wu
2012-10-10 10:07 ` [RFC v3 02/13] vfs: introduce private radix tree structures zwu.kernel
2012-10-10 15:34   ` David Sterba
2012-10-11 13:35     ` Zhi Yong Wu
2012-10-10 10:07 ` [RFC v3 03/13] vfs: Initialize and free main data structures zwu.kernel
2012-10-10 10:07 ` [RFC v3 04/13] vfs: add function for collecting raw access info zwu.kernel
2012-10-10 10:07 ` [RFC v3 05/13] vfs: add two map arrays zwu.kernel
2012-10-10 10:07 ` [RFC v3 06/13] vfs: add hooks to enable hot data tracking zwu.kernel
2012-10-10 10:07 ` [RFC v3 07/13] vfs: add function for updating map arrays zwu.kernel
2012-10-10 10:07 ` [RFC v3 08/13] vfs: add aging function for old map info zwu.kernel
2012-10-10 10:07 ` [RFC v3 09/13] vfs: add one wq to update map info periodically zwu.kernel
2012-10-16  0:27   ` Dave Chinner [this message]
2012-10-17  6:34     ` Zhi Yong Wu
2012-10-18  2:25       ` Zheng Liu
2012-10-18  2:26         ` Zhi Yong Wu
2012-10-10 10:07 ` [RFC v3 10/13] vfs: register one memory shrinker zwu.kernel
2012-10-10 10:07 ` [RFC v3 11/13] vfs: add 3 new ioctl interfaces zwu.kernel
2012-10-15  7:48   ` Dave Chinner
2012-10-15  7:57     ` Zhi Yong Wu
2012-10-16  3:17   ` Dave Chinner
2012-10-16  4:18     ` Zhi Yong Wu
2012-10-19  8:21     ` Zhi Yong Wu
2012-10-10 10:07 ` [RFC v3 12/13] vfs: add debugfs support zwu.kernel
2012-10-10 16:53   ` David Sterba
2012-10-10 21:05   ` David Sterba
2012-10-15  7:55   ` Dave Chinner
2012-10-15  8:15     ` Zhi Yong Wu
2012-10-15  8:04   ` Dave Chinner
2012-10-15  8:47     ` Zhi Yong Wu
2012-10-10 10:07 ` [RFC v3 13/13] vfs: add documentation zwu.kernel
2012-10-15  0:35   ` Zheng Liu
2012-10-15  7:04     ` Zhi Yong Wu
2012-10-15  0:39 ` [RFC v3 00/13] vfs: hot data tracking Zheng Liu
2012-10-15  7:05   ` Zhi Yong Wu
2012-10-15 20:42 ` Dave Chinner
2012-10-17  8:57   ` Zhi Yong Wu
2012-10-18  4:29     ` Dave Chinner
2012-10-18  4:44       ` Zhi Yong Wu
2012-10-18  5:17         ` Dave Chinner
2012-10-18  5:24           ` Zhi Yong Wu
2012-10-19  8:29   ` Zhi Yong Wu
2012-10-16  0:04 ` [PATCH] xfs: add hot tracking support Dave Chinner
2012-11-07  8:38   ` Zhi Yong Wu
2012-11-08  5:13     ` Dave Chinner
2012-10-16  0:11 ` [RFC v3 00/13] vfs: hot data tracking Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20121016002744.GC2864@dastard \
    --to=david@fromorbit.com \
    --cc=cmm@us.ibm.com \
    --cc=dave@jikos.cz \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxram@linux.vnet.ibm.com \
    --cc=tytso@mit.edu \
    --cc=viro@zeniv.linux.org.uk \
    --cc=wuzhy@linux.vnet.ibm.com \
    --cc=zwu.kernel@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).