From: alex chen <alex.chen@huawei.com>
To: Gang He <ghe@suse.com>
Cc: mfasheh@versity.com, jlbec@evilplan.org,
linux-kernel@vger.kernel.org, ocfs2-devel@oss.oracle.com,
Andrew Morton <akpm@linux-foundation.org>
Subject: [Ocfs2-devel] [PATCH v2 2/2] ocfs2: add trimfs lock to avoid duplicated trims in cluster
Date: Thu, 4 Jan 2018 09:39:23 +0800
Message-ID: <5A4D85CB.4020208@huawei.com>
In-Reply-To: <1513228484-2084-2-git-send-email-ghe@suse.com>
Hi Gang,
On 2017/12/14 13:14, Gang He wrote:
> As you know, ocfs2 supports trimming the underlying disk via the
> fstrim command. But there is a problem: ocfs2 is a shared-disk
> cluster file system, so if the user configures a scheduled fstrim
> job on each file system node, multiple nodes will trim the shared
> disk simultaneously. This wastes CPU and IO, and might also
> shorten the lifetime of poor-quality SSD devices.
> So we introduce a trimfs dlm lock that lets the nodes communicate
> with each other in this case, so that only one fstrim command does
> the trimming on a shared disk in the cluster. The fstrim
> commands from the other nodes wait for the first fstrim
> to finish and then return success directly, to avoid running the
> same trim on the shared disk again.
>
For the same purpose, could we take the global bitmap meta lock in EX mode
instead of adding a new trimfs dlm lock?
Thanks,
Alex
> Compared with the first version, I changed the fstrim command's return
> value and behavior for the case where another fstrim command is already
> running on the shared disk.
>
> Signed-off-by: Gang He <ghe@suse.com>
> ---
> fs/ocfs2/alloc.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 44 insertions(+)
>
> diff --git a/fs/ocfs2/alloc.c b/fs/ocfs2/alloc.c
> index ab5105f..5c9c3e2 100644
> --- a/fs/ocfs2/alloc.c
> +++ b/fs/ocfs2/alloc.c
> @@ -7382,6 +7382,7 @@ int ocfs2_trim_fs(struct super_block *sb, struct fstrim_range *range)
>  	struct buffer_head *gd_bh = NULL;
>  	struct ocfs2_dinode *main_bm;
>  	struct ocfs2_group_desc *gd = NULL;
> +	struct ocfs2_trim_fs_info info, *pinfo = NULL;
>
>  	start = range->start >> osb->s_clustersize_bits;
>  	len = range->len >> osb->s_clustersize_bits;
> @@ -7419,6 +7420,42 @@ int ocfs2_trim_fs(struct super_block *sb, struct fstrim_range *range)
>
>  	trace_ocfs2_trim_fs(start, len, minlen);
>
> +	ocfs2_trim_fs_lock_res_init(osb);
> +	ret = ocfs2_trim_fs_lock(osb, NULL, 1);
> +	if (ret < 0) {
> +		if (ret != -EAGAIN) {
> +			mlog_errno(ret);
> +			ocfs2_trim_fs_lock_res_uninit(osb);
> +			goto out_unlock;
> +		}
> +
> +		mlog(ML_NOTICE, "Wait for trim on device (%s) to "
> +		     "finish, which is running from another node.\n",
> +		     osb->dev_str);
> +		ret = ocfs2_trim_fs_lock(osb, &info, 0);
> +		if (ret < 0) {
> +			mlog_errno(ret);
> +			ocfs2_trim_fs_lock_res_uninit(osb);
> +			goto out_unlock;
> +		}
> +
> +		if (info.tf_valid && info.tf_success &&
> +		    info.tf_start == start && info.tf_len == len &&
> +		    info.tf_minlen == minlen) {
> +			/* Avoid sending duplicated trim to a shared device */
> +			mlog(ML_NOTICE, "The same trim on device (%s) was "
> +			     "just done from node (%u), return.\n",
> +			     osb->dev_str, info.tf_nodenum);
> +			range->len = info.tf_trimlen;
> +			goto out_trimunlock;
> +		}
> +	}
> +
> +	info.tf_nodenum = osb->node_num;
> +	info.tf_start = start;
> +	info.tf_len = len;
> +	info.tf_minlen = minlen;
> +
>  	/* Determine first and last group to examine based on start and len */
>  	first_group = ocfs2_which_cluster_group(main_bm_inode, start);
>  	if (first_group == osb->first_cluster_group_blkno)
> @@ -7463,6 +7500,13 @@ int ocfs2_trim_fs(struct super_block *sb, struct fstrim_range *range)
>  		group += ocfs2_clusters_to_blocks(sb, osb->bitmap_cpg);
>  	}
>  	range->len = trimmed * sb->s_blocksize;
> +
> +	info.tf_trimlen = range->len;
> +	info.tf_success = (ret ? 0 : 1);
> +	pinfo = &info;
> +out_trimunlock:
> +	ocfs2_trim_fs_unlock(osb, pinfo);
> +	ocfs2_trim_fs_lock_res_uninit(osb);
>  out_unlock:
>  	ocfs2_inode_unlock(main_bm_inode, 0);
>  	brelse(main_bm_bh);
>
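For reference, the duplicate check in the quoted hunk boils down to a
five-field comparison against the record left by the previous lock holder.
A minimal userspace sketch of that decision follows. It is only a model:
the tf_* field names come from the hunk, but the struct name
trim_info_model, the helper trim_is_duplicate(), the field widths, and the
reading of tf_valid as "a record is present" are assumptions for
illustration, not kernel code from patch 1/2.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of the tf_* fields used in the quoted hunk.
 * Field widths are assumed; the real layout is defined in patch 1/2.
 */
struct trim_info_model {
	bool     tf_valid;    /* assumed: a record from a previous holder exists */
	bool     tf_success;  /* that trim finished without error */
	uint32_t tf_nodenum;  /* node that ran the trim */
	uint64_t tf_start;    /* requested start, in clusters */
	uint64_t tf_len;      /* requested length, in clusters */
	uint64_t tf_minlen;   /* requested minimum extent length */
	uint64_t tf_trimlen;  /* bytes actually trimmed */
};

/*
 * Return true when the previous trim was valid, succeeded, and covered
 * exactly the same request, so the waiting node may skip its own trim.
 */
static bool trim_is_duplicate(const struct trim_info_model *prev,
			      uint64_t start, uint64_t len, uint64_t minlen)
{
	return prev->tf_valid && prev->tf_success &&
	       prev->tf_start == start && prev->tf_len == len &&
	       prev->tf_minlen == minlen;
}
```

Under this model a waiting node skips its trim only on an exact match of
start, len and minlen; any difference in the request, or a failed previous
trim, makes it fall through and trim again.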