[Ocfs2-devel] [PATCH 0/3] ocfs2: Resolve the problem of truncate log flush.

All of lore.kernel.org
 help / color / mirror / Atom feed

* [Ocfs2-devel] [PATCH 0/3] ocfs2: Resolve the problem of truncate log flush.
@ 2010-09-16  6:19 Tao Ma
  2010-09-16  6:21 ` [Ocfs2-devel] [PATCH 1/3] ocfs2: flush truncate log in case it contains too many clusters Tao Ma
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Tao Ma @ 2010-09-16  6:19 UTC (permalink / raw)
  To: ocfs2-devel

Hi all,
	Recently, one of our colleagues meet with a problem that if we 
write/delete a 32mb files repeatly, we will get a ENOSPC in the end. And 
the corresponding bug is 1288.
http://oss.oracle.com/bugzilla/show_bug.cgi?id=1288

So this patch set just tries to resolve it. It includes 3 patches:
0001 is a separate one. It just adds a new watermark for truncate log, 
FLUSH_TRUNCATE_LOG_RATIO. So if the truncate log has collected too much 
clusters, ocfs2_truncate_log_needs_flush will tell the caller to flush 
immediately.

0002-0003 are a little complicate here.
So in general, even we have freed the clusters to the global bitmap(by 
the above flush), we can't use them because they haven't been committed 
to the disk(check the comments above ocfs2_test_bg_bit_allocatable for 
details). So we have to tell jbd2 to flush immediately.
0002 reverts some change in commit c271c5c so that now even in local 
mode we can force jbd2 to flush immediately.
0003 add the 'checkpoint' to ocfs2_truncate_log_needs_flush, so that the 
callers know that we need to checkpoint after the flush.

Actually 0003 can work without 0002 since in cluster env, 
ocfs2_start_checkpoint will work and 0002 is just used for a local mode. 
So feel free to drop 0002 if we decide to still keep ocfs2cmt out of 
local mode.

Regards,
Tao

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Ocfs2-devel] [PATCH 1/3] ocfs2: flush truncate log in case it contains too many clusters.
  2010-09-16  6:19 [Ocfs2-devel] [PATCH 0/3] ocfs2: Resolve the problem of truncate log flush Tao Ma
@ 2010-09-16  6:21 ` Tao Ma
  2010-09-16  6:21 ` [Ocfs2-devel] [PATCH 2/3] ocfs2: Start ocfs2cmt even in local mode Tao Ma
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Tao Ma @ 2010-09-16  6:21 UTC (permalink / raw)
  To: ocfs2-devel

When we test whether we need to flush truncate log in
ocfs2_truncate_log_needs_flush, we only take care of
whether the truncate log is full. But if the volume is
small and  we have large block size(in this case truncate
log can store too many records), we may be too late
for flushing if the user create/write/delete files quickly.

So I add a new FLUSH_TRUNCATE_LOG_RATIO so that we will also
check whether the number of truncated clusters has reached
a watermark, if yes, flush the truncate log.
It resolves the ossbug #1288 somehow.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
---
 fs/ocfs2/alloc.c |   25 ++++++++++++++++++++++++-
 fs/ocfs2/ocfs2.h |    2 ++
 2 files changed, 26 insertions(+), 1 deletions(-)

diff --git a/fs/ocfs2/alloc.c b/fs/ocfs2/alloc.c
index 592fae5..c765447 100644
--- a/fs/ocfs2/alloc.c
+++ b/fs/ocfs2/alloc.c
@@ -5751,11 +5751,13 @@ out:
 	return ret;
 }
 
+#define FLUSH_TRUNCATE_LOG_RATIO	10
 int ocfs2_truncate_log_needs_flush(struct ocfs2_super *osb)
 {
 	struct buffer_head *tl_bh = osb->osb_tl_bh;
 	struct ocfs2_dinode *di;
 	struct ocfs2_truncate_log *tl;
+	int flush = 0;
 
 	di = (struct ocfs2_dinode *) tl_bh->b_data;
 	tl = &di->id2.i_dealloc;
@@ -5764,7 +5766,25 @@ int ocfs2_truncate_log_needs_flush(struct ocfs2_super *osb)
 			"slot %d, invalid truncate log parameters: used = "
 			"%u, count = %u\n", osb->slot_num,
 			le16_to_cpu(tl->tl_used), le16_to_cpu(tl->tl_count));
-	return le16_to_cpu(tl->tl_used) == le16_to_cpu(tl->tl_count);
+	if (le16_to_cpu(tl->tl_used) == le16_to_cpu(tl->tl_count))
+		flush = 1;
+	else {
+		/*
+		 * Check whether we have reserved enough clusters
+		 * to flush.
+		 */
+		u32 watermark = osb->osb_clusters_at_boot / 100 *
+				FLUSH_TRUNCATE_LOG_RATIO;
+
+		if (watermark < osb->truncated_clusters) {
+			mlog(0, "flush truncate log: watermark %u,"
+			     " we have %u clusters truncated\n",
+			     watermark, osb->truncated_clusters);
+			flush = 1;
+		}
+	}
+
+	return flush;
 }
 
 static int ocfs2_truncate_log_can_coalesce(struct ocfs2_truncate_log *tl,
@@ -5858,6 +5878,7 @@ int ocfs2_truncate_log_append(struct ocfs2_super *osb,
 
 	ocfs2_journal_dirty(handle, tl_bh);
 
+	osb->truncated_clusters += num_clusters;
 bail:
 	mlog_exit(status);
 	return status;
@@ -5929,6 +5950,8 @@ static int ocfs2_replay_truncate_records(struct ocfs2_super *osb,
 		i--;
 	}
 
+	osb->truncated_clusters = 0;
+
 bail:
 	mlog_exit(status);
 	return status;
diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h
index c67003b..5f47883 100644
--- a/fs/ocfs2/ocfs2.h
+++ b/fs/ocfs2/ocfs2.h
@@ -425,6 +425,8 @@ struct ocfs2_super
 	/* rb tree root for refcount lock. */
 	struct rb_root	osb_rf_lock_tree;
 	struct ocfs2_refcount_tree *osb_ref_tree_lru;
+
+	unsigned int truncated_clusters;
 };
 
 #define OCFS2_SB(sb)	    ((struct ocfs2_super *)(sb)->s_fs_info)
-- 
1.7.1.571.gba4d01

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [Ocfs2-devel] [PATCH 2/3] ocfs2: Start ocfs2cmt even in local mode.
  2010-09-16  6:19 [Ocfs2-devel] [PATCH 0/3] ocfs2: Resolve the problem of truncate log flush Tao Ma
  2010-09-16  6:21 ` [Ocfs2-devel] [PATCH 1/3] ocfs2: flush truncate log in case it contains too many clusters Tao Ma
@ 2010-09-16  6:21 ` Tao Ma
  2010-09-16  6:22 ` [Ocfs2-devel] [PATCH 3/3] ocfs2: Start journal checkpoint if we have too many truncated clusters Tao Ma
  2010-09-16  6:41 ` [Ocfs2-devel] [PATCH 0/3] ocfs2: Resolve the problem of truncate log flush Joel Becker
  3 siblings, 0 replies; 7+ messages in thread
From: Tao Ma @ 2010-09-16  6:21 UTC (permalink / raw)
  To: ocfs2-devel

Currently in local mode, we don't start ocfs2cmt thread,
so all the journal flush work is done by jbd2.

But we now need to flush the journal if the truncate log has
collected too much clusters because in this case all the bits
which hasn't been committed can't be allocated again(check
function ocfs2_test_bg_bit_allocatable for details).

This patch just reverts some modification of commit c271c5c
so that now we can start journal commit by ourselves by calling
ocfs2_start_checkpoint. The changes include:
1. Start ocfs2cmt no matter whether we are in local mode or not.
2. Remove the 'local' parameter from ocfs2_journal_load since now
   we don't care.
3. Always increase j_num_trans in ocfs2_start_trans to make commit
   thread work.
4. Revert the change in ocfs2_journal_shutdown.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
---
 fs/ocfs2/journal.c |   45 ++++++++++++---------------------------------
 fs/ocfs2/journal.h |    3 +--
 fs/ocfs2/super.c   |    5 +----
 3 files changed, 14 insertions(+), 39 deletions(-)

diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
index 9b57c03..e927532 100644
--- a/fs/ocfs2/journal.c
+++ b/fs/ocfs2/journal.c
@@ -374,10 +374,8 @@ handle_t *ocfs2_start_trans(struct ocfs2_super *osb, int max_buffs)
 			ocfs2_abort(osb->sb, "Detected aborted journal");
 			handle = ERR_PTR(-EROFS);
 		}
-	} else {
-		if (!ocfs2_mount_local(osb))
-			atomic_inc(&(osb->journal->j_num_trans));
-	}
+	} else
+		atomic_inc(&(osb->journal->j_num_trans));
 
 	return handle;
 }
@@ -966,23 +964,9 @@ void ocfs2_journal_shutdown(struct ocfs2_super *osb)
 
 	BUG_ON(atomic_read(&(osb->journal->j_num_trans)) != 0);
 
-	if (ocfs2_mount_local(osb)) {
-		jbd2_journal_lock_updates(journal->j_journal);
-		status = jbd2_journal_flush(journal->j_journal);
-		jbd2_journal_unlock_updates(journal->j_journal);
-		if (status < 0)
-			mlog_errno(status);
-	}
-
-	if (status == 0) {
-		/*
-		 * Do not toggle if flush was unsuccessful otherwise
-		 * will leave dirty metadata in a "clean" journal
-		 */
-		status = ocfs2_journal_toggle_dirty(osb, 0, 0);
-		if (status < 0)
-			mlog_errno(status);
-	}
+	status = ocfs2_journal_toggle_dirty(osb, 0, 0);
+	if (status < 0)
+		mlog_errno(status);
 
 	/* Shutdown the kernel journal system */
 	jbd2_journal_destroy(journal->j_journal);
@@ -1023,7 +1007,7 @@ static void ocfs2_clear_journal_error(struct super_block *sb,
 	}
 }
 
-int ocfs2_journal_load(struct ocfs2_journal *journal, int local, int replayed)
+int ocfs2_journal_load(struct ocfs2_journal *journal, int replayed)
 {
 	int status = 0;
 	struct ocfs2_super *osb;
@@ -1049,18 +1033,13 @@ int ocfs2_journal_load(struct ocfs2_journal *journal, int local, int replayed)
 	}
 
 	/* Launch the commit thread */
-	if (!local) {
-		osb->commit_task = kthread_run(ocfs2_commit_thread, osb,
-					       "ocfs2cmt");
-		if (IS_ERR(osb->commit_task)) {
-			status = PTR_ERR(osb->commit_task);
-			osb->commit_task = NULL;
-			mlog(ML_ERROR, "unable to launch ocfs2commit thread, "
-			     "error=%d", status);
-			goto done;
-		}
-	} else
+	osb->commit_task = kthread_run(ocfs2_commit_thread, osb, "ocfs2cmt");
+	if (IS_ERR(osb->commit_task)) {
+		status = PTR_ERR(osb->commit_task);
 		osb->commit_task = NULL;
+		mlog(ML_ERROR, "unable to launch ocfs2commit thread, "
+		     "error=%d", status);
+	}
 
 done:
 	mlog_exit(status);
diff --git a/fs/ocfs2/journal.h b/fs/ocfs2/journal.h
index b5baaa8..b960d5a 100644
--- a/fs/ocfs2/journal.h
+++ b/fs/ocfs2/journal.h
@@ -188,8 +188,7 @@ int    ocfs2_journal_init(struct ocfs2_journal *journal,
 void   ocfs2_journal_shutdown(struct ocfs2_super *osb);
 int    ocfs2_journal_wipe(struct ocfs2_journal *journal,
 			  int full);
-int    ocfs2_journal_load(struct ocfs2_journal *journal, int local,
-			  int replayed);
+int    ocfs2_journal_load(struct ocfs2_journal *journal, int replayed);
 int    ocfs2_check_journals_nolocks(struct ocfs2_super *osb);
 void   ocfs2_recovery_thread(struct ocfs2_super *osb,
 			     int node_num);
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index fa1be1b..3aa44cf 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -2366,7 +2366,6 @@ static int ocfs2_check_volume(struct ocfs2_super *osb)
 {
 	int status;
 	int dirty;
-	int local;
 	struct ocfs2_dinode *local_alloc = NULL; /* only used if we
 						  * recover
 						  * ourselves. */
@@ -2394,10 +2393,8 @@ static int ocfs2_check_volume(struct ocfs2_super *osb)
 		     "recovering volume.\n");
 	}
 
-	local = ocfs2_mount_local(osb);
-
 	/* will play back anything left in the journal. */
-	status = ocfs2_journal_load(osb->journal, local, dirty);
+	status = ocfs2_journal_load(osb->journal, dirty);
 	if (status < 0) {
 		mlog(ML_ERROR, "ocfs2 journal load failed! %d\n", status);
 		goto finally;
-- 
1.7.1.571.gba4d01

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [Ocfs2-devel] [PATCH 3/3] ocfs2: Start journal checkpoint if we have too many truncated clusters.
  2010-09-16  6:19 [Ocfs2-devel] [PATCH 0/3] ocfs2: Resolve the problem of truncate log flush Tao Ma
  2010-09-16  6:21 ` [Ocfs2-devel] [PATCH 1/3] ocfs2: flush truncate log in case it contains too many clusters Tao Ma
  2010-09-16  6:21 ` [Ocfs2-devel] [PATCH 2/3] ocfs2: Start ocfs2cmt even in local mode Tao Ma
@ 2010-09-16  6:22 ` Tao Ma
  2010-09-16  6:41 ` [Ocfs2-devel] [PATCH 0/3] ocfs2: Resolve the problem of truncate log flush Joel Becker
  3 siblings, 0 replies; 7+ messages in thread
From: Tao Ma @ 2010-09-16  6:22 UTC (permalink / raw)
  To: ocfs2-devel

Add a new para 'checkpoint' in ocfs2_truncate_log_needs_flush,
if it finds we need to checkpoint the journal, start it.

For xattr truncate, I don't pass the parameter now since the value
now has a limit of 64K, so I don't think we need a checkpoint there.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
---
 fs/ocfs2/alloc.c |   25 ++++++++++++++++++-------
 fs/ocfs2/alloc.h |    2 +-
 fs/ocfs2/xattr.c |    4 ++--
 3 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/fs/ocfs2/alloc.c b/fs/ocfs2/alloc.c
index c765447..218bc2b 100644
--- a/fs/ocfs2/alloc.c
+++ b/fs/ocfs2/alloc.c
@@ -5646,7 +5646,7 @@ int ocfs2_remove_btree_range(struct inode *inode,
 			     struct ocfs2_cached_dealloc_ctxt *dealloc,
 			     u64 refcount_loc)
 {
-	int ret, credits = 0, extra_blocks = 0;
+	int ret, credits = 0, extra_blocks = 0, checkpoint = 0;
 	u64 phys_blkno = ocfs2_clusters_to_blocks(inode->i_sb, phys_cpos);
 	struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
 	struct inode *tl_inode = osb->osb_tl_inode;
@@ -5686,7 +5686,7 @@ int ocfs2_remove_btree_range(struct inode *inode,
 
 	mutex_lock(&tl_inode->i_mutex);
 
-	if (ocfs2_truncate_log_needs_flush(osb)) {
+	if (ocfs2_truncate_log_needs_flush(osb, &checkpoint)) {
 		ret = __ocfs2_flush_truncate_log(osb);
 		if (ret < 0) {
 			mlog_errno(ret);
@@ -5748,11 +5748,14 @@ out:
 	if (ref_tree)
 		ocfs2_unlock_refcount_tree(osb, ref_tree, 1);
 
+	if (!ret && checkpoint)
+		ocfs2_start_checkpoint(osb);
+
 	return ret;
 }
 
 #define FLUSH_TRUNCATE_LOG_RATIO	10
-int ocfs2_truncate_log_needs_flush(struct ocfs2_super *osb)
+int ocfs2_truncate_log_needs_flush(struct ocfs2_super *osb, int *checkpoint)
 {
 	struct buffer_head *tl_bh = osb->osb_tl_bh;
 	struct ocfs2_dinode *di;
@@ -5781,6 +5784,9 @@ int ocfs2_truncate_log_needs_flush(struct ocfs2_super *osb)
 			     " we have %u clusters truncated\n",
 			     watermark, osb->truncated_clusters);
 			flush = 1;
+
+			if (checkpoint)
+				*checkpoint = 1;
 		}
 	}
 
@@ -6187,7 +6193,7 @@ bail:
 int ocfs2_complete_truncate_log_recovery(struct ocfs2_super *osb,
 					 struct ocfs2_dinode *tl_copy)
 {
-	int status = 0;
+	int status = 0, checkpoint = 0;
 	int i;
 	unsigned int clusters, num_recs, start_cluster;
 	u64 start_blk;
@@ -6209,7 +6215,7 @@ int ocfs2_complete_truncate_log_recovery(struct ocfs2_super *osb,
 
 	mutex_lock(&tl_inode->i_mutex);
 	for(i = 0; i < num_recs; i++) {
-		if (ocfs2_truncate_log_needs_flush(osb)) {
+		if (ocfs2_truncate_log_needs_flush(osb, &checkpoint)) {
 			status = __ocfs2_flush_truncate_log(osb);
 			if (status < 0) {
 				mlog_errno(status);
@@ -6240,6 +6246,9 @@ int ocfs2_complete_truncate_log_recovery(struct ocfs2_super *osb,
 bail_up:
 	mutex_unlock(&tl_inode->i_mutex);
 
+	if (!status && checkpoint)
+		ocfs2_start_checkpoint(osb);
+
 	mlog_exit(status);
 	return status;
 }
@@ -6442,12 +6451,12 @@ static int ocfs2_free_cached_clusters(struct ocfs2_super *osb,
 	struct ocfs2_cached_block_free *tmp;
 	struct inode *tl_inode = osb->osb_tl_inode;
 	handle_t *handle;
-	int ret = 0;
+	int ret = 0, checkpoint = 0;
 
 	mutex_lock(&tl_inode->i_mutex);
 
 	while (head) {
-		if (ocfs2_truncate_log_needs_flush(osb)) {
+		if (ocfs2_truncate_log_needs_flush(osb, &checkpoint)) {
 			ret = __ocfs2_flush_truncate_log(osb);
 			if (ret < 0) {
 				mlog_errno(ret);
@@ -6485,6 +6494,8 @@ static int ocfs2_free_cached_clusters(struct ocfs2_super *osb,
 		kfree(tmp);
 	}
 
+	if (!ret && checkpoint)
+		ocfs2_start_checkpoint(osb);
 	return ret;
 }
 
diff --git a/fs/ocfs2/alloc.h b/fs/ocfs2/alloc.h
index 55762b5..967ee21 100644
--- a/fs/ocfs2/alloc.h
+++ b/fs/ocfs2/alloc.h
@@ -182,7 +182,7 @@ int ocfs2_begin_truncate_log_recovery(struct ocfs2_super *osb,
 				      struct ocfs2_dinode **tl_copy);
 int ocfs2_complete_truncate_log_recovery(struct ocfs2_super *osb,
 					 struct ocfs2_dinode *tl_copy);
-int ocfs2_truncate_log_needs_flush(struct ocfs2_super *osb);
+int ocfs2_truncate_log_needs_flush(struct ocfs2_super *osb, int *checkpoint);
 int ocfs2_truncate_log_append(struct ocfs2_super *osb,
 			      handle_t *handle,
 			      u64 start_blk,
diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c
index d03469f..82f1fa2 100644
--- a/fs/ocfs2/xattr.c
+++ b/fs/ocfs2/xattr.c
@@ -3595,7 +3595,7 @@ int ocfs2_xattr_set(struct inode *inode,
 
 	mutex_lock(&tl_inode->i_mutex);
 
-	if (ocfs2_truncate_log_needs_flush(osb)) {
+	if (ocfs2_truncate_log_needs_flush(osb, NULL)) {
 		ret = __ocfs2_flush_truncate_log(osb);
 		if (ret < 0) {
 			mutex_unlock(&tl_inode->i_mutex);
@@ -5447,7 +5447,7 @@ static int ocfs2_rm_xattr_cluster(struct inode *inode,
 
 	mutex_lock(&tl_inode->i_mutex);
 
-	if (ocfs2_truncate_log_needs_flush(osb)) {
+	if (ocfs2_truncate_log_needs_flush(osb, NULL)) {
 		ret = __ocfs2_flush_truncate_log(osb);
 		if (ret < 0) {
 			mlog_errno(ret);
-- 
1.7.1.571.gba4d01

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [Ocfs2-devel] [PATCH 0/3] ocfs2: Resolve the problem of truncate log flush.
  2010-09-16  6:19 [Ocfs2-devel] [PATCH 0/3] ocfs2: Resolve the problem of truncate log flush Tao Ma
                   ` (2 preceding siblings ...)
  2010-09-16  6:22 ` [Ocfs2-devel] [PATCH 3/3] ocfs2: Start journal checkpoint if we have too many truncated clusters Tao Ma
@ 2010-09-16  6:41 ` Joel Becker
  2010-09-16  8:51   ` Tao Ma
  3 siblings, 1 reply; 7+ messages in thread
From: Joel Becker @ 2010-09-16  6:41 UTC (permalink / raw)
  To: ocfs2-devel

On Thu, Sep 16, 2010 at 02:19:51PM +0800, Tao Ma wrote:
> 0002-0003 are a little complicate here.
> So in general, even we have freed the clusters to the global
> bitmap(by the above flush), we can't use them because they haven't
> been committed to the disk(check the comments above
> ocfs2_test_bg_bit_allocatable for details). So we have to tell jbd2
> to flush immediately.

Mark,
	Didn't we do something to allow our allocation to re-use these
clusters?  Or was that for "thought to be allocated but never actually
used" clusters?

> 0002 reverts some change in commit c271c5c so that now even in local
> mode we can force jbd2 to flush immediately.
> 0003 add the 'checkpoint' to ocfs2_truncate_log_needs_flush, so that
> the callers know that we need to checkpoint after the flush.
> 
> Actually 0003 can work without 0002 since in cluster env,
> ocfs2_start_checkpoint will work and 0002 is just used for a local
> mode. So feel free to drop 0002 if we decide to still keep ocfs2cmt
> out of local mode.

	Assuming we want to force jbd2 out, which is heavy as you point
out, can't we just sync the filesystem?  Wouldn't that work without
adding ocfs2cmt?

Joel

-- 

"I am working for the time when unqualified blacks, browns, and
 women join the unqualified men in running our government."
	- Sissy Farenthold

Joel Becker
Consulting Software Developer
Oracle
E-mail: joel.becker at oracle.com
Phone: (650) 506-8127

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Ocfs2-devel] [PATCH 0/3] ocfs2: Resolve the problem of truncate log flush.
  2010-09-16  6:41 ` [Ocfs2-devel] [PATCH 0/3] ocfs2: Resolve the problem of truncate log flush Joel Becker
@ 2010-09-16  8:51   ` Tao Ma
  2010-09-16 16:29     ` Joel Becker
  0 siblings, 1 reply; 7+ messages in thread
From: Tao Ma @ 2010-09-16  8:51 UTC (permalink / raw)
  To: ocfs2-devel

Hi Joel,

On 09/16/2010 02:41 PM, Joel Becker wrote:
> On Thu, Sep 16, 2010 at 02:19:51PM +0800, Tao Ma wrote:
>> 0002-0003 are a little complicate here.
>> So in general, even we have freed the clusters to the global
>> bitmap(by the above flush), we can't use them because they haven't
>> been committed to the disk(check the comments above
>> ocfs2_test_bg_bit_allocatable for details). So we have to tell jbd2
>> to flush immediately.
>
> Mark,
> 	Didn't we do something to allow our allocation to re-use these
> clusters?  Or was that for "thought to be allocated but never actually
> used" clusters?
At least from my test, we don't re-use them. If yes, my first patch 
should resolve the problem and no need for 0002 and 0003.
>
>> 0002 reverts some change in commit c271c5c so that now even in local
>> mode we can force jbd2 to flush immediately.
>> 0003 add the 'checkpoint' to ocfs2_truncate_log_needs_flush, so that
>> the callers know that we need to checkpoint after the flush.
>>
>> Actually 0003 can work without 0002 since in cluster env,
>> ocfs2_start_checkpoint will work and 0002 is just used for a local
>> mode. So feel free to drop 0002 if we decide to still keep ocfs2cmt
>> out of local mode.
>
> 	Assuming we want to force jbd2 out, which is heavy as you point
> out, can't we just sync the filesystem?  Wouldn't that work without
> adding ocfs2cmt?
I am just worried about the locking. My current solution call 
ocfs2_start_checkpoint which only wakeup checkpoint_event. We have no 
chance of locking and what's more, ocfs2cmt seems to work steadily for 
several years.

So maybe we can add a work to ocfs2_wq and let it do the 
ocfs2_sync_fs()? I am just afraid of blocking ocfs2_wq like what orphan 
scan did recently. You know that. ;)

Regards,
Tao

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Ocfs2-devel] [PATCH 0/3] ocfs2: Resolve the problem of truncate log flush.
  2010-09-16  8:51   ` Tao Ma
@ 2010-09-16 16:29     ` Joel Becker
  0 siblings, 0 replies; 7+ messages in thread
From: Joel Becker @ 2010-09-16 16:29 UTC (permalink / raw)
  To: ocfs2-devel

On Thu, Sep 16, 2010 at 04:51:30PM +0800, Tao Ma wrote:
> >	Assuming we want to force jbd2 out, which is heavy as you point
> >out, can't we just sync the filesystem?  Wouldn't that work without
> >adding ocfs2cmt?
> I am just worried about the locking. My current solution call
> ocfs2_start_checkpoint which only wakeup checkpoint_event. We have
> no chance of locking and what's more, ocfs2cmt seems to work
> steadily for several years.

	I'd love to example the locking.  Assuming we are outside a
handle but probably inside i_mutex for the file we are growing, what
would the locking be like for ocfs2_sync_fs()?  Starting a handle and
waiting on jbd2 should absolutely be safe
(jbd2_journal_start_commit()/jbd2_log_wait_commit()).  It's flushing the
truncate log under another i_mutex.  I think it's gotta be safe, because
we take truncate_log->i_mutex under another inode's i_mutex when
truncating.

> So maybe we can add a work to ocfs2_wq and let it do the
> ocfs2_sync_fs()? I am just afraid of blocking ocfs2_wq like what
> orphan scan did recently. You know that. ;)

	I would be wary of that too.  I think the locking should work
inline, though.

Joel

-- 

None of our men are "experts."  We have most unfortunately found
it necessary to get rid of a man as soon as he thinks himself an
expert -- because no one ever considers himself expert if he really
knows his job.  A man who knows a job sees so much more to be done
than he has done, that he is always pressing forward and never
gives up an instant of thought to how good and how efficient he is.
Thinking always ahead, thinking always of trying to do more, brings
a state of mind in which nothing is impossible. The moment one gets
into the "expert" state of mind a great number of things become
impossible.
	- From Henry Ford Sr., "My Life and Work"

Joel Becker
Consulting Software Developer
Oracle
E-mail: joel.becker at oracle.com
Phone: (650) 506-8127

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2010-09-16 16:29 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-09-16  6:19 [Ocfs2-devel] [PATCH 0/3] ocfs2: Resolve the problem of truncate log flush Tao Ma
2010-09-16  6:21 ` [Ocfs2-devel] [PATCH 1/3] ocfs2: flush truncate log in case it contains too many clusters Tao Ma
2010-09-16  6:21 ` [Ocfs2-devel] [PATCH 2/3] ocfs2: Start ocfs2cmt even in local mode Tao Ma
2010-09-16  6:22 ` [Ocfs2-devel] [PATCH 3/3] ocfs2: Start journal checkpoint if we have too many truncated clusters Tao Ma
2010-09-16  6:41 ` [Ocfs2-devel] [PATCH 0/3] ocfs2: Resolve the problem of truncate log flush Joel Becker
2010-09-16  8:51   ` Tao Ma
2010-09-16 16:29     ` Joel Becker

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.