From: alex chen <alex.chen@huawei.com>
To: Gang He <ghe@suse.com>
Cc: <mfasheh@versity.com>, <jlbec@evilplan.org>,
<linux-kernel@vger.kernel.org>, <ocfs2-devel@oss.oracle.com>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [Ocfs2-devel] [PATCH v2 2/2] ocfs2: add trimfs lock to avoid duplicated trims in cluster
Date: Thu, 4 Jan 2018 09:39:23 +0800
Message-ID: <5A4D85CB.4020208@huawei.com>
In-Reply-To: <1513228484-2084-2-git-send-email-ghe@suse.com>
Hi Gang,
On 2017/12/14 13:14, Gang He wrote:
> As you know, ocfs2 supports trimming the underlying disk via the
> fstrim command. But there is a problem: ocfs2 is a shared-disk
> cluster file system, so if the user configures a scheduled fstrim
> job on every file system node, multiple nodes will trim the shared
> disk simultaneously. This wastes CPU and IO, and might also
> negatively affect the lifetime of poor-quality SSD devices.
> So we introduce a trimfs dlm lock to coordinate the nodes in this
> case: only one fstrim command does the actual trimming on the
> shared disk in the cluster, while the fstrim commands from the
> other nodes wait for the first fstrim to finish and then return
> success directly, to avoid running the same trim on the shared
> disk again.
>
For the same purpose, can we take the global bitmap meta lock in EXMODE
instead of adding a new trimfs dlm lock?
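Something like the following, just to illustrate the idea (untested; it
reuses the ocfs2_inode_lock()/ocfs2_inode_unlock() helpers and the
out_mutex error path already present in ocfs2_trim_fs(), only the lock
level changes):

	/*
	 * Sketch: take the global bitmap meta lock in EX mode (third
	 * argument 1 instead of 0), so that fstrim runs from different
	 * nodes serialize on the existing lock resource instead of on
	 * a new trimfs dlm lock resource.
	 */
	ret = ocfs2_inode_lock(main_bm_inode, &main_bm_bh, 1); /* EX, was 0 */
	if (ret < 0) {
		mlog_errno(ret);
		goto out_mutex;
	}

	/* ... the trim loop itself is unchanged ... */

	ocfs2_inode_unlock(main_bm_inode, 1); /* must match the EX level */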
Thanks,
Alex
> Compared with the first version, I changed the fstrim command's return
> value and behavior in the case where an fstrim command is already
> running on the shared disk.
>
> Signed-off-by: Gang He <ghe@suse.com>
> ---
> fs/ocfs2/alloc.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 44 insertions(+)
>
> diff --git a/fs/ocfs2/alloc.c b/fs/ocfs2/alloc.c
> index ab5105f..5c9c3e2 100644
> --- a/fs/ocfs2/alloc.c
> +++ b/fs/ocfs2/alloc.c
> @@ -7382,6 +7382,7 @@ int ocfs2_trim_fs(struct super_block *sb, struct fstrim_range *range)
> struct buffer_head *gd_bh = NULL;
> struct ocfs2_dinode *main_bm;
> struct ocfs2_group_desc *gd = NULL;
> + struct ocfs2_trim_fs_info info, *pinfo = NULL;
>
> start = range->start >> osb->s_clustersize_bits;
> len = range->len >> osb->s_clustersize_bits;
> @@ -7419,6 +7420,42 @@ int ocfs2_trim_fs(struct super_block *sb, struct fstrim_range *range)
>
> trace_ocfs2_trim_fs(start, len, minlen);
>
> + ocfs2_trim_fs_lock_res_init(osb);
> + ret = ocfs2_trim_fs_lock(osb, NULL, 1);
> + if (ret < 0) {
> + if (ret != -EAGAIN) {
> + mlog_errno(ret);
> + ocfs2_trim_fs_lock_res_uninit(osb);
> + goto out_unlock;
> + }
> +
> + mlog(ML_NOTICE, "Wait for trim on device (%s) to "
> + "finish, which is running from another node.\n",
> + osb->dev_str);
> + ret = ocfs2_trim_fs_lock(osb, &info, 0);
> + if (ret < 0) {
> + mlog_errno(ret);
> + ocfs2_trim_fs_lock_res_uninit(osb);
> + goto out_unlock;
> + }
> +
> + if (info.tf_valid && info.tf_success &&
> + info.tf_start == start && info.tf_len == len &&
> + info.tf_minlen == minlen) {
> + /* Avoid sending duplicated trim to a shared device */
> + mlog(ML_NOTICE, "The same trim on device (%s) was "
> + "just done from node (%u), return.\n",
> + osb->dev_str, info.tf_nodenum);
> + range->len = info.tf_trimlen;
> + goto out_trimunlock;
> + }
> + }
> +
> + info.tf_nodenum = osb->node_num;
> + info.tf_start = start;
> + info.tf_len = len;
> + info.tf_minlen = minlen;
> +
> /* Determine first and last group to examine based on start and len */
> first_group = ocfs2_which_cluster_group(main_bm_inode, start);
> if (first_group == osb->first_cluster_group_blkno)
> @@ -7463,6 +7500,13 @@ int ocfs2_trim_fs(struct super_block *sb, struct fstrim_range *range)
> group += ocfs2_clusters_to_blocks(sb, osb->bitmap_cpg);
> }
> range->len = trimmed * sb->s_blocksize;
> +
> + info.tf_trimlen = range->len;
> + info.tf_success = (ret ? 0 : 1);
> + pinfo = &info;
> +out_trimunlock:
> + ocfs2_trim_fs_unlock(osb, pinfo);
> + ocfs2_trim_fs_lock_res_uninit(osb);
> out_unlock:
> ocfs2_inode_unlock(main_bm_inode, 0);
> brelse(main_bm_bh);
>