* [Ocfs2-devel] [PATCH 1/2] Ocfs2: Add a mount option "coherency=*" for O_DIRECT writes.
@ 2010-10-09 11:26 Tristan Ye
2010-10-09 11:26 ` [Ocfs2-devel] [PATCH 2/2] Ocfs2: Handle O_DIRECT writes with coherency option Tristan Ye
2010-10-10 10:51 ` [Ocfs2-devel] [PATCH 1/2] Ocfs2: Add a mount option "coherency=*" for O_DIRECT writes Joel Becker
0 siblings, 2 replies; 6+ messages in thread
From: Tristan Ye @ 2010-10-09 11:26 UTC (permalink / raw)
To: ocfs2-devel
Currently, default behavior of O_DIRECT writes was allowing
concurrent writing among nodes, no cluster coherency guaranteed
(no EX locks was taken), it hurts buffered reads on other nodes
by reading stale data from cache.
The new mount option introduce a chance to choose two different
behaviors for O_DIRECT writes:
* coherency=full, as the default value, will disallow
concurrent O_DIRECT writes by taking
EX locks.
* coherency=buffered, allow concurrent O_DIRECT writes
without EX lock among nodes, which
gains high performance at risk of
getting stale data on other nodes.
Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
---
Documentation/filesystems/ocfs2.txt | 7 +++++++
fs/ocfs2/ocfs2.h | 3 +++
fs/ocfs2/super.c | 23 +++++++++++++++++++++++
3 files changed, 33 insertions(+), 0 deletions(-)
diff --git a/Documentation/filesystems/ocfs2.txt b/Documentation/filesystems/ocfs2.txt
index 1f7ae14..5393e66 100644
--- a/Documentation/filesystems/ocfs2.txt
+++ b/Documentation/filesystems/ocfs2.txt
@@ -87,3 +87,10 @@ dir_resv_level= (*) By default, directory reservations will scale with file
reservations - users should rarely need to change this
value. If allocation reservations are turned off, this
option will have no effect.
+coherency=full (*) Disallow concurrent O_DIRECT writes, cluster inode
+ lock will be taken to force other nodes drop cache,
+ therefore full cluster coherency is guaranteed even
+ for O_DIRECT writes.
+coherency=buffered Allow concurrent O_DIRECT writes without EX lock among
+ nodes, which gains high performance at risk of getting
+ stale data on other nodes.
diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h
index c67003b..2b987be 100644
--- a/fs/ocfs2/ocfs2.h
+++ b/fs/ocfs2/ocfs2.h
@@ -256,6 +256,9 @@ enum ocfs2_mount_options
control lists */
OCFS2_MOUNT_USRQUOTA = 1 << 10, /* We support user quotas */
OCFS2_MOUNT_GRPQUOTA = 1 << 11, /* We support group quotas */
+
+ OCFS2_MOUNT_COHERENCY_BUFFERED = 1 << 12 /* Allow concurrent O_DIRECT
+ writes */
};
#define OCFS2_OSB_SOFT_RO 0x0001
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index fa1be1b..36ee274 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -177,6 +177,8 @@ enum {
Opt_noacl,
Opt_usrquota,
Opt_grpquota,
+ Opt_coherency_buffered,
+ Opt_coherency_full,
Opt_resv_level,
Opt_dir_resv_level,
Opt_err,
@@ -205,6 +207,8 @@ static const match_table_t tokens = {
{Opt_noacl, "noacl"},
{Opt_usrquota, "usrquota"},
{Opt_grpquota, "grpquota"},
+ {Opt_coherency_buffered, "coherency=buffered"},
+ {Opt_coherency_full, "coherency=full"},
{Opt_resv_level, "resv_level=%u"},
{Opt_dir_resv_level, "dir_resv_level=%u"},
{Opt_err, NULL}
@@ -631,6 +635,14 @@ static int ocfs2_remount(struct super_block *sb, int *flags, char *data)
goto out;
}
+ if ((osb->s_mount_opt & OCFS2_MOUNT_COHERENCY_BUFFERED) !=
+ (parsed_options.mount_opt & OCFS2_MOUNT_COHERENCY_BUFFERED)) {
+ ret = -EINVAL;
+ mlog(ML_ERROR, "Cannot change cluster coherency mode on "
+ "remount\n");
+ goto out;
+ }
+
/* Probably don't want this on remount; it might
* mess with other nodes */
if (!(osb->s_mount_opt & OCFS2_MOUNT_INODE64) &&
@@ -1438,6 +1450,12 @@ static int ocfs2_parse_options(struct super_block *sb,
case Opt_grpquota:
mopt->mount_opt |= OCFS2_MOUNT_GRPQUOTA;
break;
+ case Opt_coherency_buffered:
+ mopt->mount_opt |= OCFS2_MOUNT_COHERENCY_BUFFERED;
+ break;
+ case Opt_coherency_full:
+ mopt->mount_opt &= ~OCFS2_MOUNT_COHERENCY_BUFFERED;
+ break;
case Opt_acl:
mopt->mount_opt |= OCFS2_MOUNT_POSIX_ACL;
mopt->mount_opt &= ~OCFS2_MOUNT_NO_POSIX_ACL;
@@ -1536,6 +1554,11 @@ static int ocfs2_show_options(struct seq_file *s, struct vfsmount *mnt)
if (opts & OCFS2_MOUNT_GRPQUOTA)
seq_printf(s, ",grpquota");
+ if (opts & OCFS2_MOUNT_COHERENCY_BUFFERED)
+ seq_printf(s, ",coherency=buffered");
+ else
+ seq_printf(s, ",coherency=full");
+
if (opts & OCFS2_MOUNT_NOUSERXATTR)
seq_printf(s, ",nouser_xattr");
else
--
1.5.5
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [Ocfs2-devel] [PATCH 2/2] Ocfs2: Handle O_DIRECT writes with coherency option.
2010-10-09 11:26 [Ocfs2-devel] [PATCH 1/2] Ocfs2: Add a mount option "coherency=*" for O_DIRECT writes Tristan Ye
@ 2010-10-09 11:26 ` Tristan Ye
2010-10-10 10:59 ` Joel Becker
2010-10-10 10:51 ` [Ocfs2-devel] [PATCH 1/2] Ocfs2: Add a mount option "coherency=*" for O_DIRECT writes Joel Becker
1 sibling, 1 reply; 6+ messages in thread
From: Tristan Ye @ 2010-10-09 11:26 UTC (permalink / raw)
To: ocfs2-devel
Change the default behavior to take the EX lock for all writes,
both buffered and O_DIRECT, then also allow concurrent O_DIRECT
writes by recognizing "coherency=buffered" option.
Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
---
fs/ocfs2/file.c | 30 ++++++++++++++++++++++++++++--
1 files changed, 28 insertions(+), 2 deletions(-)
diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
index 9a03c15..22710a0 100644
--- a/fs/ocfs2/file.c
+++ b/fs/ocfs2/file.c
@@ -2255,14 +2255,40 @@ relock:
have_alloc_sem = 1;
}
- /* concurrent O_DIRECT writes are allowed */
- rw_level = !direct_io;
+ /*
+ * concurrent O_DIRECT writes are allowed with
+ * mount_option "coherency=buffered".
+ */
+ if (direct_io) {
+ rw_level = !(osb->s_mount_opt & OCFS2_MOUNT_COHERENCY_BUFFERED);
+ } else
+ rw_level = !direct_io;
+
ret = ocfs2_rw_lock(inode, rw_level);
if (ret < 0) {
mlog_errno(ret);
goto out_sems;
}
+ /*
+ * O_DIRECT writes with "coherency=full" need to take EX cluster
+ * inode_lock to guarantee coherency.
+ */
+ if ((direct_io) &&
+ !(osb->s_mount_opt & OCFS2_MOUNT_COHERENCY_BUFFERED)) {
+ ret = ocfs2_inode_lock(inode, NULL, 1);
+ if (ret < 0) {
+ mlog_errno(ret);
+ goto out_sems;
+ }
+
+ /*
+ * Safe to drop the inode_lock immediately since we're just
+ * telling other nodes to flush their cache.
+ */
+ ocfs2_inode_unlock(inode, 1);
+ }
+
can_do_direct = direct_io;
ret = ocfs2_prepare_inode_for_write(file->f_path.dentry, ppos,
iocb->ki_left, appending,
--
1.5.5
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [Ocfs2-devel] [PATCH 1/2] Ocfs2: Add a mount option "coherency=*" for O_DIRECT writes.
2010-10-09 11:26 [Ocfs2-devel] [PATCH 1/2] Ocfs2: Add a mount option "coherency=*" for O_DIRECT writes Tristan Ye
2010-10-09 11:26 ` [Ocfs2-devel] [PATCH 2/2] Ocfs2: Handle O_DIRECT writes with coherency option Tristan Ye
@ 2010-10-10 10:51 ` Joel Becker
1 sibling, 0 replies; 6+ messages in thread
From: Joel Becker @ 2010-10-10 10:51 UTC (permalink / raw)
To: ocfs2-devel
On Sat, Oct 09, 2010 at 07:26:41PM +0800, Tristan Ye wrote:
> Currently, default behavior of O_DIRECT writes was allowing
> concurrent writing among nodes, no cluster coherency guaranteed
> (no EX locks was taken), it hurts buffered reads on other nodes
> by reading stale data from cache.
>
> The new mount option introduce a chance to choose two different
> behaviors for O_DIRECT writes:
>
> * coherency=full, as the default value, will disallow
> concurrent O_DIRECT writes by taking
> EX locks.
>
> * coherency=buffered, allow concurrent O_DIRECT writes
> without EX lock among nodes, which
> gains high performance at risk of
> getting stale data on other nodes.
>
> Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
I think these two patches should be combined into one. It's not
a lot of change, and it makes sense as one unit.
> @@ -631,6 +635,14 @@ static int ocfs2_remount(struct super_block *sb, int *flags, char *data)
> goto out;
> }
>
> + if ((osb->s_mount_opt & OCFS2_MOUNT_COHERENCY_BUFFERED) !=
> + (parsed_options.mount_opt & OCFS2_MOUNT_COHERENCY_BUFFERED)) {
> + ret = -EINVAL;
> + mlog(ML_ERROR, "Cannot change cluster coherency mode on "
> + "remount\n");
> + goto out;
> + }
Don't include this. I think it's a feature to allow coherency
mode to be changed on the fly. It doesn't hurt us any; it is a safe
thing to change.
Joel
--
Life's Little Instruction Book #207
"Swing for the fence."
Joel Becker
Consulting Software Developer
Oracle
E-mail: joel.becker at oracle.com
Phone: (650) 506-8127
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Ocfs2-devel] [PATCH 2/2] Ocfs2: Handle O_DIRECT writes with coherency option.
2010-10-09 11:26 ` [Ocfs2-devel] [PATCH 2/2] Ocfs2: Handle O_DIRECT writes with coherency option Tristan Ye
@ 2010-10-10 10:59 ` Joel Becker
2010-10-10 13:43 ` Tao Ma
0 siblings, 1 reply; 6+ messages in thread
From: Joel Becker @ 2010-10-10 10:59 UTC (permalink / raw)
To: ocfs2-devel
On Sat, Oct 09, 2010 at 07:26:42PM +0800, Tristan Ye wrote:
> - /* concurrent O_DIRECT writes are allowed */
> - rw_level = !direct_io;
> + /*
> + * concurrent O_DIRECT writes are allowed with
> + * mount_option "coherency=buffered".
> + */
> + if (direct_io) {
> + rw_level = !(osb->s_mount_opt & OCFS2_MOUNT_COHERENCY_BUFFERED);
> + } else
> + rw_level = !direct_io;
> +
I think I'd like:
if (direct_io && (osb->s_mount_opt & OCFS2_MOUNT_COHERENCY_BUFFERED))
rw_level = 0;
else
rw_level = 1;
It actually matches your comment much better. But since we're going to
be using it again later, perhaps you should set 'int full_coherency =
(osb->s_mount_opt & OCFS2_MOUNT_COHERENCY_BUFFERED)' up at the top of
the function and then do:
rw_level = (!direct_io || full_coherency)
> ret = ocfs2_rw_lock(inode, rw_level);
> if (ret < 0) {
> mlog_errno(ret);
> goto out_sems;
> }
>
> + /*
> + * O_DIRECT writes with "coherency=full" need to take EX cluster
> + * inode_lock to guarantee coherency.
> + */
> + if ((direct_io) &&
> + !(osb->s_mount_opt & OCFS2_MOUNT_COHERENCY_BUFFERED)) {
Then this check can be:
if (direct_io && full_coherency) {
/*
* We need to take and drop the inode lock to force
* other nodes to drop their caches. Buffered I/O
* already does this in write_begin().
*/
> + ret = ocfs2_inode_lock(inode, NULL, 1);
> + if (ret < 0) {
> + mlog_errno(ret);
> + goto out_sems;
> + }
> +
> + /*
> + * Safe to drop the inode_lock immediately since we're just
> + * telling other nodes to flush their cache.
> + */
And you don't need this comment.
> + ocfs2_inode_unlock(inode, 1);
> + }
--
"The lawgiver, of all beings, most owes the law allegiance. He of all
men should behave as though the law compelled him. But it is the
universal weakness of mankind that what we are given to administer we
presently imagine we own."
- H.G. Wells
Joel Becker
Consulting Software Developer
Oracle
E-mail: joel.becker at oracle.com
Phone: (650) 506-8127
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Ocfs2-devel] [PATCH 2/2] Ocfs2: Handle O_DIRECT writes with coherency option.
2010-10-10 10:59 ` Joel Becker
@ 2010-10-10 13:43 ` Tao Ma
2010-10-10 19:47 ` Joel Becker
0 siblings, 1 reply; 6+ messages in thread
From: Tao Ma @ 2010-10-10 13:43 UTC (permalink / raw)
To: ocfs2-devel
? 2010-10-10 18:59, Joel Becker wrote:
> On Sat, Oct 09, 2010 at 07:26:42PM +0800, Tristan Ye wrote:
>> - /* concurrent O_DIRECT writes are allowed */
>> - rw_level = !direct_io;
>> + /*
>> + * concurrent O_DIRECT writes are allowed with
>> + * mount_option "coherency=buffered".
>> + */
>> + if (direct_io) {
>> + rw_level = !(osb->s_mount_opt& OCFS2_MOUNT_COHERENCY_BUFFERED);
>> + } else
>> + rw_level = !direct_io;
>> +
> I think I'd like:
>
> if (direct_io&& (osb->s_mount_opt& OCFS2_MOUNT_COHERENCY_BUFFERED))
> rw_level = 0;
> else
> rw_level = 1;
>
> It actually matches your comment much better. But since we're going to
> be using it again later, perhaps you should set 'int full_coherency =
> (osb->s_mount_opt& OCFS2_MOUNT_COHERENCY_BUFFERED)' up at the top of
> the function and then do:
>
> rw_level = (!direct_io || full_coherency)
yeah, it looks more natural.
>> ret = ocfs2_rw_lock(inode, rw_level);
>> if (ret< 0) {
>> mlog_errno(ret);
>> goto out_sems;
>> }
>>
>> + /*
>> + * O_DIRECT writes with "coherency=full" need to take EX cluster
>> + * inode_lock to guarantee coherency.
>> + */
>> + if ((direct_io)&&
>> + !(osb->s_mount_opt& OCFS2_MOUNT_COHERENCY_BUFFERED)) {
> Then this check can be:
>
> if (direct_io&& full_coherency) {
> /*
> * We need to take and drop the inode lock to force
> * other nodes to drop their caches. Buffered I/O
> * already does this in write_begin().
> */
>
>> + ret = ocfs2_inode_lock(inode, NULL, 1);
>> + if (ret< 0) {
>> + mlog_errno(ret);
>> + goto out_sems;
>> + }
>> +
>> + /*
>> + * Safe to drop the inode_lock immediately since we're just
>> + * telling other nodes to flush their cache.
>> + */
> And you don't need this comment.
I also have another concern. Do we really need the exclusive lock? I
think a PR lock should
let others to flush the cache for us.
Regards,
Tao
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Ocfs2-devel] [PATCH 2/2] Ocfs2: Handle O_DIRECT writes with coherency option.
2010-10-10 13:43 ` Tao Ma
@ 2010-10-10 19:47 ` Joel Becker
0 siblings, 0 replies; 6+ messages in thread
From: Joel Becker @ 2010-10-10 19:47 UTC (permalink / raw)
To: ocfs2-devel
On Sun, Oct 10, 2010 at 09:43:52PM +0800, Tao Ma wrote:
> ? 2010-10-10 18:59, Joel Becker wrote:
> >On Sat, Oct 09, 2010 at 07:26:42PM +0800, Tristan Ye wrote:
> > if (direct_io&& full_coherency) {
> > /*
> > * We need to take and drop the inode lock to force
> > * other nodes to drop their caches. Buffered I/O
> > * already does this in write_begin().
> > */
> >
> >>+ ret = ocfs2_inode_lock(inode, NULL, 1);
> >>+ if (ret< 0) {
> >>+ mlog_errno(ret);
> >>+ goto out_sems;
> >>+ }
> >>+
> >>+ /*
> >>+ * Safe to drop the inode_lock immediately since we're just
> >>+ * telling other nodes to flush their cache.
> >>+ */
> > And you don't need this comment.
> I also have another concern. Do we really need the exclusive lock? I
> think a PR lock should
> let others to flush the cache for us.
If they already have a PR, they're not going to drop it just
because we take a PR. They're going to stay at PR and keep their cached
data, which is the right thing for them to do from a performance
standpoint.
Joel
--
"There is a country in Europe where multiple-choice tests are
illegal."
- Sigfried Hulzer
Joel Becker
Consulting Software Developer
Oracle
E-mail: joel.becker at oracle.com
Phone: (650) 506-8127
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2010-10-10 19:47 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-10-09 11:26 [Ocfs2-devel] [PATCH 1/2] Ocfs2: Add a mount option "coherency=*" for O_DIRECT writes Tristan Ye
2010-10-09 11:26 ` [Ocfs2-devel] [PATCH 2/2] Ocfs2: Handle O_DIRECT writes with coherency option Tristan Ye
2010-10-10 10:59 ` Joel Becker
2010-10-10 13:43 ` Tao Ma
2010-10-10 19:47 ` Joel Becker
2010-10-10 10:51 ` [Ocfs2-devel] [PATCH 1/2] Ocfs2: Add a mount option "coherency=*" for O_DIRECT writes Joel Becker
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).