* [Ocfs2-devel] [PATCH 0/3] Add inode steal for ocfs2.V2
@ 2008-02-27 22:36 Tao Ma
  2008-02-27 22:59 ` [Ocfs2-devel] [PATCH 1/3] Add a new parameter for ocfs2_reserve_suballoc_bits.V2 Tao Ma
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Tao Ma @ 2008-02-27 22:36 UTC (permalink / raw)
  To: ocfs2-devel

Hi all,
This patch series adds an inode-stealing mechanism for inode allocation.

Changes from V1 to V2:
1. Add a new member to ocfs2_super that records the slot from which we
   last stole an inode successfully. This speeds up the next steal.
2. Modify the mechanism of inode stealing. We now start our search
   from the slot next to our own.
3. Remove the extra, final allocation attempt in the local slot, since
   there is no evidence that enough of the global bitmap will be freed
   in the time it takes us to search the other allocators.

In OCFS2, we allocate inodes from a slot-specific inode_alloc to avoid
inode creation congestion. This per-slot alloc file grows in large
contiguous chunks. With a 4K block size, it grows by 4M each time, so
1024 inodes are allocated at a time.
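
The arithmetic behind the 1024 figure can be checked with a small helper. This is only an illustrative sketch: the function name is hypothetical (it is not part of ocfs2), and it assumes each inode occupies exactly one filesystem block, which is what the numbers above imply.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not an ocfs2 function: number of inodes that one
 * growth step of the inode_alloc file provides, assuming one inode per
 * filesystem block.
 */
static uint32_t inodes_per_chunk(uint32_t chunk_bytes, uint32_t block_bytes)
{
	/* The chunk must be a whole number of blocks. */
	assert(block_bytes > 0 && chunk_bytes % block_bytes == 0);
	return chunk_bytes / block_bytes;
}
```

Under that assumption, a 4M chunk with a 4K block size gives 4194304 / 4096 = 1024 inodes per growth step.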

Over time, if the fs gets fragmented enough (e.g., the user has created many
small files and deleted some of them), we can end up in a situation where
we cannot extend the inode_alloc because we don't have a large free chunk
in the global_bitmap, even if df shows a few gigs free. More annoying is
that this situation will invariably mean that one cannot create inodes
on one node but can on another. Still more annoying is that an unused
slot may have space for plenty of inodes but is unusable because the user
may no longer be mounting as many nodes.

This patch series implements a solution: steal inodes from another slot.
The whole inode allocation process now looks like this:
1. Allocate from our own inode_alloc:000X.
   1) If we can reserve, OK.
   2) If that fails, try to allocate a large chunk and reserve once again.
2. If 1 fails, try to allocate from ocfs2_super->inode_steal_slot.
   This time we only try to reserve; we don't go to the global_bitmap if
   this inode_alloc can't allocate the inode either.
3. If 2 fails, try each following slot in turn until we reach the steal
   slot again.
ocfs2_super->inode_steal_slot is initialized to the slot next to our own.
Once an inode steal succeeds, we update it to the slot we stole the inode
from. It is also reinitialized when the local truncate log is flushed or
local alloc recovery completes, in which case the global bitmap may have
been refreshed.
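
The search order described above can be pictured with a small user-space model. This is a sketch under stated assumptions, not the kernel code: struct steal_model and both functions are hypothetical names, each slot's inode_alloc is reduced to a counter of free inode bits, and growing the local allocator from the global_bitmap (step 1.2) is not modelled.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical model of the per-slot inode allocators: free_bits[i] is the
 * number of free inode bits left in slot i's inode_alloc.
 */
struct steal_model {
	int max_slots;           /* number of slots in the cluster */
	int own_slot;            /* slot this node is mounted with */
	int steal_slot;          /* where the next steal search starts */
	unsigned int *free_bits; /* free_bits[i]: free inodes in slot i */
};

/* Start the steal search at the slot after our own, wrapping around. */
static void steal_model_init(struct steal_model *m)
{
	assert(m->max_slots > 0);
	assert(m->own_slot >= 0 && m->own_slot < m->max_slots);
	m->steal_slot = (m->own_slot + 1) % m->max_slots;
}

/* Returns the slot the inode was taken from, or -ENOSPC. */
static int steal_model_alloc(struct steal_model *m)
{
	int i, slot;

	/* Step 1: our own allocator always comes first. */
	if (m->free_bits[m->own_slot] > 0) {
		m->free_bits[m->own_slot]--;
		return m->own_slot;
	}

	/* Steps 2 and 3: begin at the remembered steal slot, then walk on. */
	slot = m->steal_slot;
	for (i = 0; i < m->max_slots; i++, slot = (slot + 1) % m->max_slots) {
		if (slot == m->own_slot)
			continue;
		if (m->free_bits[slot] > 0) {
			m->free_bits[slot]--;
			m->steal_slot = slot; /* remember for the next steal */
			return slot;
		}
	}

	return -ENOSPC;
}
```

Remembering the last successful slot means a later steal does not rescan slots already found to be full, which is the speed-up that change 1 in the V2 list refers to.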

* [Ocfs2-devel] [PATCH 1/3] Add a new parameter for ocfs2_reserve_suballoc_bits.V2
  2008-02-27 22:36 [Ocfs2-devel] [PATCH 0/3] Add inode steal for ocfs2.V2 Tao Ma
@ 2008-02-27 22:59 ` Tao Ma
  2008-02-28 15:06   ` Mark Fasheh
  2008-02-27 23:01 ` [Ocfs2-devel] [PATCH 2/3] Add ac_alloc_slot in ocfs2_alloc_context.V2 Tao Ma
  2008-02-27 23:01 ` [Ocfs2-devel] [PATCH 3/3] Add inode stealing for ocfs2_reserve_new_inode.V2 Tao Ma
  2 siblings, 1 reply; 10+ messages in thread
From: Tao Ma @ 2008-02-27 22:59 UTC (permalink / raw)
  To: ocfs2-devel

In some cases (e.g. inode stealing from other nodes), we may not want
ocfs2_reserve_suballoc_bits to allocate new groups from the
global_bitmap, since it may already be full. So add a new parameter
to control this.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
---
 fs/ocfs2/suballoc.c |   22 ++++++++++++++++++----
 1 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c
index 72c198a..3be4e73 100644
--- a/fs/ocfs2/suballoc.c
+++ b/fs/ocfs2/suballoc.c
@@ -46,6 +46,9 @@
 
 #include "buffer_head_io.h"
 
+#define NOT_ALLOC_NEW_GROUP		0
+#define ALLOC_NEW_GROUP			1
+
 static inline void ocfs2_debug_bg(struct ocfs2_group_desc *bg);
 static inline void ocfs2_debug_suballoc_inode(struct ocfs2_dinode *fe);
 static inline u16 ocfs2_find_victim_chain(struct ocfs2_chain_list *cl);
@@ -391,7 +394,8 @@ bail:
 static int ocfs2_reserve_suballoc_bits(struct ocfs2_super *osb,
 				       struct ocfs2_alloc_context *ac,
 				       int type,
-				       u32 slot)
+				       u32 slot,
+				       int alloc_new_group)
 {
 	int status;
 	u32 bits_wanted = ac->ac_bits_wanted;
@@ -446,6 +450,14 @@ static int ocfs2_reserve_suballoc_bits(struct ocfs2_super *osb,
 			goto bail;
 		}
 
+		if (alloc_new_group != ALLOC_NEW_GROUP) {
+			mlog(0, "Alloc File %u Full: wanted=%u, free_bits=%u, "
+			     "and we don't alloc a new group for it.\n",
+			     slot, bits_wanted, free_bits);
+			status = -ENOSPC;
+			goto bail;
+		}
+
 		status = ocfs2_block_group_alloc(osb, alloc_inode, bh);
 		if (status < 0) {
 			if (status != -ENOSPC)
@@ -490,7 +502,8 @@ int ocfs2_reserve_new_metadata(struct ocfs2_super *osb,
 	(*ac)->ac_group_search = ocfs2_block_group_search;
 
 	status = ocfs2_reserve_suballoc_bits(osb, (*ac),
-					     EXTENT_ALLOC_SYSTEM_INODE, slot);
+					     EXTENT_ALLOC_SYSTEM_INODE,
+					     slot, ALLOC_NEW_GROUP);
 	if (status < 0) {
 		if (status != -ENOSPC)
 			mlog_errno(status);
@@ -527,7 +540,7 @@ int ocfs2_reserve_new_inode(struct ocfs2_super *osb,
 
 	status = ocfs2_reserve_suballoc_bits(osb, *ac,
 					     INODE_ALLOC_SYSTEM_INODE,
-					     osb->slot_num);
+					     osb->slot_num, ALLOC_NEW_GROUP);
 	if (status < 0) {
 		if (status != -ENOSPC)
 			mlog_errno(status);
@@ -557,7 +570,8 @@ int ocfs2_reserve_cluster_bitmap_bits(struct ocfs2_super *osb,
 
 	status = ocfs2_reserve_suballoc_bits(osb, ac,
 					     GLOBAL_BITMAP_SYSTEM_INODE,
-					     OCFS2_INVALID_SLOT);
+					     OCFS2_INVALID_SLOT,
+					     ALLOC_NEW_GROUP);
 	if (status < 0 && status != -ENOSPC) {
 		mlog_errno(status);
 		goto bail;
-- 
1.5.3.GIT

* [Ocfs2-devel] [PATCH 2/3] Add ac_alloc_slot in ocfs2_alloc_context.V2
  2008-02-27 22:36 [Ocfs2-devel] [PATCH 0/3] Add inode steal for ocfs2.V2 Tao Ma
  2008-02-27 22:59 ` [Ocfs2-devel] [PATCH 1/3] Add a new parameter for ocfs2_reserve_suballoc_bits.V2 Tao Ma
@ 2008-02-27 23:01 ` Tao Ma
  2008-02-28 15:08   ` Mark Fasheh
  2008-02-27 23:01 ` [Ocfs2-devel] [PATCH 3/3] Add inode stealing for ocfs2_reserve_new_inode.V2 Tao Ma
  2 siblings, 1 reply; 10+ messages in thread
From: Tao Ma @ 2008-02-27 23:01 UTC (permalink / raw)
  To: ocfs2-devel

With inode stealing, we no longer restrict allocation to
the local node. So it is necessary to add a new member
to ocfs2_alloc_context to indicate which slot we are
allocating from. We also modify the local alloc path
so that this member is set there too.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
---
 fs/ocfs2/localalloc.c |    1 +
 fs/ocfs2/suballoc.c   |    1 +
 fs/ocfs2/suballoc.h   |    1 +
 3 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/fs/ocfs2/localalloc.c b/fs/ocfs2/localalloc.c
index add1ffd..250b4bc 100644
--- a/fs/ocfs2/localalloc.c
+++ b/fs/ocfs2/localalloc.c
@@ -526,6 +526,7 @@ int ocfs2_reserve_local_alloc_bits(struct ocfs2_super *osb,
 	}
 
 	ac->ac_inode = local_alloc_inode;
+	ac->ac_alloc_slot = osb->slot_num;
 	ac->ac_which = OCFS2_AC_USE_LOCAL;
 	get_bh(osb->local_alloc_bh);
 	ac->ac_bh = osb->local_alloc_bh;
diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c
index 3be4e73..33d5573 100644
--- a/fs/ocfs2/suballoc.c
+++ b/fs/ocfs2/suballoc.c
@@ -424,6 +424,7 @@ static int ocfs2_reserve_suballoc_bits(struct ocfs2_super *osb,
 	}
 
 	ac->ac_inode = alloc_inode;
+	ac->ac_alloc_slot = slot;
 
 	fe = (struct ocfs2_dinode *) bh->b_data;
 	if (!OCFS2_IS_VALID_DINODE(fe)) {
diff --git a/fs/ocfs2/suballoc.h b/fs/ocfs2/suballoc.h
index 8799033..544c600 100644
--- a/fs/ocfs2/suballoc.h
+++ b/fs/ocfs2/suballoc.h
@@ -36,6 +36,7 @@ typedef int (group_search_t)(struct inode *,
 struct ocfs2_alloc_context {
 	struct inode *ac_inode;    /* which bitmap are we allocating from? */
 	struct buffer_head *ac_bh; /* file entry bh */
+	u32    ac_alloc_slot;   /* which slot are we allocating from? */
 	u32    ac_bits_wanted;
 	u32    ac_bits_given;
 #define OCFS2_AC_USE_LOCAL 1
-- 
1.5.3.GIT

* [Ocfs2-devel] [PATCH 3/3] Add inode stealing for ocfs2_reserve_new_inode.V2
  2008-02-27 22:36 [Ocfs2-devel] [PATCH 0/3] Add inode steal for ocfs2.V2 Tao Ma
  2008-02-27 22:59 ` [Ocfs2-devel] [PATCH 1/3] Add a new parameter for ocfs2_reserve_suballoc_bits.V2 Tao Ma
  2008-02-27 23:01 ` [Ocfs2-devel] [PATCH 2/3] Add ac_alloc_slot in ocfs2_alloc_context.V2 Tao Ma
@ 2008-02-27 23:01 ` Tao Ma
  2008-02-28 15:31   ` Mark Fasheh
  2 siblings, 1 reply; 10+ messages in thread
From: Tao Ma @ 2008-02-27 23:01 UTC (permalink / raw)
  To: ocfs2-devel

Add inode stealing for ocfs2_reserve_new_inode. The whole process is now:
1. Allocate from our own inode_alloc:000X.
   1) If we can reserve, OK.
   2) If that fails, try to allocate a large chunk and reserve once again.
2. If 1 fails, try to allocate from ocfs2_super->inode_steal_slot.
   This time we only try to reserve; we don't go to the global_bitmap if
   this inode_alloc can't allocate the inode either.
3. If 2 fails, try each following slot in turn until we reach the steal
   slot again.

ocfs2_super->inode_steal_slot is initialized to the slot next to our own.
Once an inode steal succeeds, we update it to the slot we stole the inode
from. It is also reinitialized when the local truncate log is flushed or
local alloc recovery completes, in which case the global bitmap may have
been refreshed.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
---
 fs/ocfs2/alloc.c      |    2 +
 fs/ocfs2/localalloc.c |    2 +
 fs/ocfs2/namei.c      |    2 +-
 fs/ocfs2/ocfs2.h      |   36 +++++++++++++++++++++++++++++++++-
 fs/ocfs2/suballoc.c   |   50 +++++++++++++++++++++++++++++++++++++++++++++++-
 fs/ocfs2/super.c      |    1 +
 6 files changed, 88 insertions(+), 5 deletions(-)

diff --git a/fs/ocfs2/alloc.c b/fs/ocfs2/alloc.c
index 447206e..f333cdc 100644
--- a/fs/ocfs2/alloc.c
+++ b/fs/ocfs2/alloc.c
@@ -4788,6 +4788,8 @@ static void ocfs2_truncate_log_worker(struct work_struct *work)
 	status = ocfs2_flush_truncate_log(osb);
 	if (status < 0)
 		mlog_errno(status);
+	else
+		ocfs2_init_inode_steal_slot(osb);
 
 	mlog_exit(status);
 }
diff --git a/fs/ocfs2/localalloc.c b/fs/ocfs2/localalloc.c
index 250b4bc..1965162 100644
--- a/fs/ocfs2/localalloc.c
+++ b/fs/ocfs2/localalloc.c
@@ -450,6 +450,8 @@ out_mutex:
 	iput(main_bm_inode);
 
 out:
+	if (!status)
+		ocfs2_init_inode_steal_slot(osb);
 	mlog_exit(status);
 	return status;
 }
diff --git a/fs/ocfs2/namei.c b/fs/ocfs2/namei.c
index ae9ad95..ab5a227 100644
--- a/fs/ocfs2/namei.c
+++ b/fs/ocfs2/namei.c
@@ -424,7 +424,7 @@ static int ocfs2_mknod_locked(struct ocfs2_super *osb,
 	fe->i_fs_generation = cpu_to_le32(osb->fs_generation);
 	fe->i_blkno = cpu_to_le64(fe_blkno);
 	fe->i_suballoc_bit = cpu_to_le16(suballoc_bit);
-	fe->i_suballoc_slot = cpu_to_le16(osb->slot_num);
+	fe->i_suballoc_slot = cpu_to_le16(inode_ac->ac_alloc_slot);
 	fe->i_uid = cpu_to_le32(current->fsuid);
 	if (dir->i_mode & S_ISGID) {
 		fe->i_gid = cpu_to_le32(dir->i_gid);
diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h
index 6546cef..75b7fe0 100644
--- a/fs/ocfs2/ocfs2.h
+++ b/fs/ocfs2/ocfs2.h
@@ -206,11 +206,13 @@ struct ocfs2_super
 	u32 s_feature_incompat;
 	u32 s_feature_ro_compat;
 
-	/* Protects s_next_generaion, osb_flags. Could protect more on
-	 * osb as it's very short lived. */
+	/* Protects s_next_generation, osb_flags and inode_steal_slot.
+	 * Could protect more on osb as it's very short lived.
+	 */
 	spinlock_t osb_lock;
 	u32 s_next_generation;
 	unsigned long osb_flags;
+	s16 inode_steal_slot;
 
 	unsigned long s_mount_opt;
 	unsigned int s_atime_quantum;
@@ -522,6 +524,36 @@ static inline unsigned int ocfs2_pages_per_cluster(struct super_block *sb)
 	return pages_per_cluster;
 }
 
+static inline void ocfs2_init_inode_steal_slot(struct ocfs2_super *osb)
+{
+	BUG_ON(osb->slot_num == OCFS2_INVALID_SLOT);
+
+	spin_lock(&osb->osb_lock);
+	osb->inode_steal_slot = osb->slot_num + 1;
+	if (osb->inode_steal_slot == osb->max_slots)
+		osb->inode_steal_slot = 0;
+	spin_unlock(&osb->osb_lock);
+}
+
+static inline void ocfs2_set_inode_steal_slot(struct ocfs2_super *osb,
+					      u16 slot)
+{
+	spin_lock(&osb->osb_lock);
+	osb->inode_steal_slot = slot;
+	spin_unlock(&osb->osb_lock);
+}
+
+static inline u16 ocfs2_get_inode_steal_slot(struct ocfs2_super *osb)
+{
+	s16 slot;
+
+	spin_lock(&osb->osb_lock);
+	slot = osb->inode_steal_slot;
+	spin_unlock(&osb->osb_lock);
+
+	return slot;
+}
+
 #define ocfs2_set_bit ext2_set_bit
 #define ocfs2_clear_bit ext2_clear_bit
 #define ocfs2_test_bit ext2_test_bit
diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c
index 33d5573..1657b7f 100644
--- a/fs/ocfs2/suballoc.c
+++ b/fs/ocfs2/suballoc.c
@@ -109,7 +109,7 @@ static inline void ocfs2_block_to_cluster_group(struct inode *inode,
 						u64 *bg_blkno,
 						u16 *bg_bit_off);
 
-void ocfs2_free_alloc_context(struct ocfs2_alloc_context *ac)
+static void ocfs2_free_ac_resource(struct ocfs2_alloc_context *ac)
 {
 	struct inode *inode = ac->ac_inode;
 
@@ -120,9 +120,17 @@ void ocfs2_free_alloc_context(struct ocfs2_alloc_context *ac)
 		mutex_unlock(&inode->i_mutex);
 
 		iput(inode);
+		ac->ac_inode = NULL;
 	}
-	if (ac->ac_bh)
+	if (ac->ac_bh) {
 		brelse(ac->ac_bh);
+		ac->ac_bh = NULL;
+	}
+}
+
+void ocfs2_free_alloc_context(struct ocfs2_alloc_context *ac)
+{
+	ocfs2_free_ac_resource(ac);
 	kfree(ac);
 }
 
@@ -522,6 +530,33 @@ bail:
 	return status;
 }
 
+static int ocfs2_steal_inode_from_other_nodes(struct ocfs2_super *osb,
+					      struct ocfs2_alloc_context *ac)
+{
+	int status = -ENOSPC, i;
+	s16 slot = ocfs2_get_inode_steal_slot(osb);
+
+	for (i = 0; i < osb->max_slots; i++, slot++) {
+		if (slot == osb->max_slots)
+			slot = 0;
+
+		if (slot == osb->slot_num)
+			continue;
+
+		status = ocfs2_reserve_suballoc_bits(osb, ac,
+						     INODE_ALLOC_SYSTEM_INODE,
+						     slot, NOT_ALLOC_NEW_GROUP);
+		if (status >= 0) {
+			ocfs2_set_inode_steal_slot(osb, slot);
+			break;
+		}
+
+		ocfs2_free_ac_resource(ac);
+	}
+
+	return status;
+}
+
 int ocfs2_reserve_new_inode(struct ocfs2_super *osb,
 			    struct ocfs2_alloc_context **ac)
 {
@@ -542,6 +577,17 @@ int ocfs2_reserve_new_inode(struct ocfs2_super *osb,
 	status = ocfs2_reserve_suballoc_bits(osb, *ac,
 					     INODE_ALLOC_SYSTEM_INODE,
 					     osb->slot_num, ALLOC_NEW_GROUP);
+	if (status >= 0) {
+		status = 0;
+		goto bail;
+	} else if (status < 0 && status != -ENOSPC) {
+		mlog_errno(status);
+		goto bail;
+	}
+
+	ocfs2_free_ac_resource(*ac);
+
+	status = ocfs2_steal_inode_from_other_nodes(osb, *ac);
 	if (status < 0) {
 		if (status != -ENOSPC)
 			mlog_errno(status);
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index bec75af..c4e82c7 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -1193,6 +1193,7 @@ static int ocfs2_mount_volume(struct super_block *sb)
 		mlog_errno(status);
 		goto leave;
 	}
+	ocfs2_init_inode_steal_slot(osb);
 
 	/* load all node-local system inodes */
 	status = ocfs2_init_local_system_inodes(osb);
-- 
1.5.3.GIT

* [Ocfs2-devel] [PATCH 1/3] Add a new parameter for ocfs2_reserve_suballoc_bits.V2
  2008-02-27 22:59 ` [Ocfs2-devel] [PATCH 1/3] Add a new parameter for ocfs2_reserve_suballoc_bits.V2 Tao Ma
@ 2008-02-28 15:06   ` Mark Fasheh
  0 siblings, 0 replies; 10+ messages in thread
From: Mark Fasheh @ 2008-02-28 15:06 UTC (permalink / raw)
  To: ocfs2-devel

On Thu, Feb 28, 2008 at 02:59:04PM +0800, tao.ma wrote:
> In some cases(Inode stealing from other nodes), we may not want
> ocfs2_reserve_suballoc_bits to allocate new groups from the
> global_bitmap since it may already be full. So add a new parameter
> for this.
> 
> Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

--
Mark Fasheh
Principal Software Developer, Oracle
mark.fasheh@oracle.com

* [Ocfs2-devel] [PATCH 2/3] Add ac_alloc_slot in ocfs2_alloc_context.V2
  2008-02-27 23:01 ` [Ocfs2-devel] [PATCH 2/3] Add ac_alloc_slot in ocfs2_alloc_context.V2 Tao Ma
@ 2008-02-28 15:08   ` Mark Fasheh
  0 siblings, 0 replies; 10+ messages in thread
From: Mark Fasheh @ 2008-02-28 15:08 UTC (permalink / raw)
  To: ocfs2-devel

On Thu, Feb 28, 2008 at 02:59:48PM +0800, tao.ma wrote:
> In inode stealing, we no longer restrict the allocation to
> happen in the local node. So it is neccessary for us to add
> a new member in ocfs2_alloc_context to indicate which slot
> we are using for allocation. We also modify the process of
> local alloc so that this member can be used there also.
> 
> Signed-off-by: Tao Ma <tao.ma@oracle.com>
> ---
>  fs/ocfs2/localalloc.c |    1 +
>  fs/ocfs2/suballoc.c   |    1 +
>  fs/ocfs2/suballoc.h   |    1 +
>  3 files changed, 3 insertions(+), 0 deletions(-)
> 
> diff --git a/fs/ocfs2/localalloc.c b/fs/ocfs2/localalloc.c
> index add1ffd..250b4bc 100644
> --- a/fs/ocfs2/localalloc.c
> +++ b/fs/ocfs2/localalloc.c
> @@ -526,6 +526,7 @@ int ocfs2_reserve_local_alloc_bits(struct ocfs2_super *osb,
>  	}
>  
>  	ac->ac_inode = local_alloc_inode;
> +	ac->ac_alloc_slot = osb->slot_num;

Put a comment above this line:

	/* We should never use localalloc from another slot */

I don't want someone to accidentally change that in the future.
	--Mark

--
Mark Fasheh
Principal Software Developer, Oracle
mark.fasheh@oracle.com

* [Ocfs2-devel] [PATCH 3/3] Add inode stealing for ocfs2_reserve_new_inode.V2
  2008-02-27 23:01 ` [Ocfs2-devel] [PATCH 3/3] Add inode stealing for ocfs2_reserve_new_inode.V2 Tao Ma
@ 2008-02-28 15:31   ` Mark Fasheh
  2008-02-28 17:11     ` tao.ma
  0 siblings, 1 reply; 10+ messages in thread
From: Mark Fasheh @ 2008-02-28 15:31 UTC (permalink / raw)
  To: ocfs2-devel

On Thu, Feb 28, 2008 at 03:00:26PM +0800, tao.ma wrote:
> Add inode stealing for ocfs2_reserve_new_inode. Now the whole process is:
> 1. Allocate from its own inode_alloc:000X
>    1) If we can reserve, OK.
>    2) If fails, try to allocate a large chunk and reserve once again.
> 2. If 1 fails, try to allocate from the ocfs2_super->inode_steal_slot.
>    This time, Just try to reserve, we don't go for global_bitmap if
>    this inode also can't allocate the inode.
> 3. If 2 fails, try the node next until we reach that steal slot again.
> 
> ocfs2_super->inode_steal_slot is initalized as the node next to our own
> slot. And once the inode stealing successes, we will refresh it with
> the slot we steal inode from.

> It will also be reinitalized when the local
> truncate log or local alloc recovery is flushed in which case the global
> bitmap may be refreshed.

How about when we free an inode from our slot's allocator?


> diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h
> index 6546cef..75b7fe0 100644
> --- a/fs/ocfs2/ocfs2.h
> +++ b/fs/ocfs2/ocfs2.h
> @@ -206,11 +206,13 @@ struct ocfs2_super
>  	u32 s_feature_incompat;
>  	u32 s_feature_ro_compat;
>  
> -	/* Protects s_next_generaion, osb_flags. Could protect more on
> -	 * osb as it's very short lived. */
> +	/* Protects s_next_generation, osb_flags and inode_steal_slot.
> +	 * Could protect more on osb as it's very short lived.
> +	 */
>  	spinlock_t osb_lock;
>  	u32 s_next_generation;
>  	unsigned long osb_flags;
> +	s16 inode_steal_slot;

Please prefix this with "s_" or "osb_".


>  	unsigned long s_mount_opt;
>  	unsigned int s_atime_quantum;
> @@ -522,6 +524,36 @@ static inline unsigned int ocfs2_pages_per_cluster(struct super_block *sb)
>  	return pages_per_cluster;
>  }
 
> +static inline void ocfs2_init_inode_steal_slot(struct ocfs2_super *osb)
> +{
> +	BUG_ON(osb->slot_num == OCFS2_INVALID_SLOT);
> +
> +	spin_lock(&osb->osb_lock);
> +	osb->inode_steal_slot = osb->slot_num + 1;
> +	if (osb->inode_steal_slot == osb->max_slots)
> +		osb->inode_steal_slot = 0;
> +	spin_unlock(&osb->osb_lock);
> +}

We probably don't want to inline this.


>  int ocfs2_reserve_new_inode(struct ocfs2_super *osb,
>  			    struct ocfs2_alloc_context **ac)
>  {
> @@ -542,6 +577,17 @@ int ocfs2_reserve_new_inode(struct ocfs2_super *osb,
>  	status = ocfs2_reserve_suballoc_bits(osb, *ac,
>  					     INODE_ALLOC_SYSTEM_INODE,
>  					     osb->slot_num, ALLOC_NEW_GROUP);
> +	if (status >= 0) {
> +		status = 0;
> +		goto bail;
> +	} else if (status < 0 && status != -ENOSPC) {
> +		mlog_errno(status);
> +		goto bail;
> +	}
> +
> +	ocfs2_free_ac_resource(*ac);
> +
> +	status = ocfs2_steal_inode_from_other_nodes(osb, *ac);

Does this mean we always search our own first, even if we know it's not
likely to have anything in it? Wouldn't it be better to ignore it once it's
full until we get to one of the spots where you've inserted a call to
ocfs2_init_inode_steal_slot()?
	--Mark

--
Mark Fasheh
Principal Software Developer, Oracle
mark.fasheh@oracle.com

* [Ocfs2-devel] [PATCH 3/3] Add inode stealing for ocfs2_reserve_new_inode.V2
  2008-02-28 15:31   ` Mark Fasheh
@ 2008-02-28 17:11     ` tao.ma
  2008-02-28 17:42       ` Mark Fasheh
  0 siblings, 1 reply; 10+ messages in thread
From: tao.ma @ 2008-02-28 17:11 UTC (permalink / raw)
  To: ocfs2-devel

Mark Fasheh wrote:
> On Thu, Feb 28, 2008 at 03:00:26PM +0800, tao.ma wrote:
>> Add inode stealing for ocfs2_reserve_new_inode. Now the whole process is:
>> 1. Allocate from its own inode_alloc:000X
>>    1) If we can reserve, OK.
>>    2) If fails, try to allocate a large chunk and reserve once again.
>> 2. If 1 fails, try to allocate from the ocfs2_super->inode_steal_slot.
>>    This time, Just try to reserve, we don't go for global_bitmap if
>>    this inode also can't allocate the inode.
>> 3. If 2 fails, try the node next until we reach that steal slot again.
>>
>> ocfs2_super->inode_steal_slot is initalized as the node next to our own
>> slot. And once the inode stealing successes, we will refresh it with
>> the slot we steal inode from.
> 
>> It will also be reinitalized when the local
>> truncate log or local alloc recovery is flushed in which case the global
>> bitmap may be refreshed.
> 
> How about when we free an inode from our slots allocator?
Since I always try to allocate from my own slot first, this should not
be a problem: if we free an inode, its block can be allocated
again for a new request.
>>  int ocfs2_reserve_new_inode(struct ocfs2_super *osb,
>>  			    struct ocfs2_alloc_context **ac)
>>  {
>> @@ -542,6 +577,17 @@ int ocfs2_reserve_new_inode(struct ocfs2_super *osb,
>>  	status = ocfs2_reserve_suballoc_bits(osb, *ac,
>>  					     INODE_ALLOC_SYSTEM_INODE,
>>  					     osb->slot_num, ALLOC_NEW_GROUP);
>> +	if (status >= 0) {
>> +		status = 0;
>> +		goto bail;
>> +	} else if (status < 0 && status != -ENOSPC) {
>> +		mlog_errno(status);
>> +		goto bail;
>> +	}
>> +
>> +	ocfs2_free_ac_resource(*ac);
>> +
>> +	status = ocfs2_steal_inode_from_other_nodes(osb, *ac);
> 
> Does this mean we always search our own first, even if we know it's not
> likely to have anything in it? Wouldn't it be better to ignore once it's
> full until we get to one of the spots where you've inserted a call to
> ocfs2_init_inode_steal_slot)?
I am worried about the case where the global bitmap is flushed by
other nodes: how would the current node notice this and start using its
own local allocator again? Or should it notice this at all?
Another question: what if an inode owned by this slot is deleted
by another slot? We should then be able to allocate the inode
from our own slot again.

* [Ocfs2-devel] [PATCH 3/3] Add inode stealing for ocfs2_reserve_new_inode.V2
  2008-02-28 17:11     ` tao.ma
@ 2008-02-28 17:42       ` Mark Fasheh
  2008-02-28 21:17         ` tao.ma
  0 siblings, 1 reply; 10+ messages in thread
From: Mark Fasheh @ 2008-02-28 17:42 UTC (permalink / raw)
  To: ocfs2-devel

On Fri, Feb 29, 2008 at 09:10:12AM +0800, tao.ma wrote:
>>> @@ -542,6 +577,17 @@ int ocfs2_reserve_new_inode(struct ocfs2_super *osb,
>>>  	status = ocfs2_reserve_suballoc_bits(osb, *ac,
>>>  					     INODE_ALLOC_SYSTEM_INODE,
>>>  					     osb->slot_num, ALLOC_NEW_GROUP);
>>> +	if (status >= 0) {
>>> +		status = 0;
>>> +		goto bail;
>>> +	} else if (status < 0 && status != -ENOSPC) {
>>> +		mlog_errno(status);
>>> +		goto bail;
>>> +	}
>>> +
>>> +	ocfs2_free_ac_resource(*ac);
>>> +
>>> +	status = ocfs2_steal_inode_from_other_nodes(osb, *ac);
>> Does this mean we always search our own first, even if we know it's not
>> likely to have anything in it? Wouldn't it be better to ignore once it's
>> full until we get to one of the spots where you've inserted a call to
>> ocfs2_init_inode_steal_slot)?
> I am worried about the situation that the global bitmap is flushed by other 
> nodes and how the current node notice this and use its own local allocator 
> again. Or should it notice this?
> Another thing is that what if the inode which is owned by this slot deleted 
> by other slots? We should have the ability to allocate the inode from our 
> own slot now.

Both your points come down to "how do we know when another node has changed
the allocators such that we can now allocate inodes from our local
inode_alloc file."

Unfortunately, we have no way of knowing what another node does. I'm mostly
worried about the inefficiency of continuously searching ours when there
isn't likely to be any room left in it.

One thing we could do is only go back to our local allocator after some time
has passed (or some number of allocs). That way we don't check it every
single time, but we never completely leave it out of the picture either.
	--Mark
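
The idea of returning to the local allocator only after some number of allocations could be sketched as a small counter-based policy. Everything here is hypothetical: the thread does not specify an interval or any of these names, and the actual follow-up patch may look quite different.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical retry interval; the thread does not give a value. */
#define LOCAL_RETRY_INTERVAL 100

struct local_retry_state {
	bool local_full;      /* last attempt on our own inode_alloc hit ENOSPC */
	unsigned int skipped; /* allocations served by stealing since then */
};

/* Should this allocation try our own inode_alloc before stealing? */
static bool should_try_local(const struct local_retry_state *s)
{
	return !s->local_full || s->skipped >= LOCAL_RETRY_INTERVAL;
}

/* Record the outcome of an attempt on our own inode_alloc. */
static void note_local_result(struct local_retry_state *s, bool succeeded)
{
	s->local_full = !succeeded;
	s->skipped = 0;
}

/* Record an allocation that went straight to stealing. */
static void note_local_skipped(struct local_retry_state *s)
{
	assert(s->local_full);
	s->skipped++;
}
```

With this shape the local allocator is rechecked once every LOCAL_RETRY_INTERVAL steals, so space freed by another node is eventually noticed without paying for a failed local search on every allocation. A time-based variant would compare a timestamp instead of a counter.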

--
Mark Fasheh
Principal Software Developer, Oracle
mark.fasheh@oracle.com

* [Ocfs2-devel] [PATCH 3/3] Add inode stealing for ocfs2_reserve_new_inode.V2
  2008-02-28 17:42       ` Mark Fasheh
@ 2008-02-28 21:17         ` tao.ma
  0 siblings, 0 replies; 10+ messages in thread
From: tao.ma @ 2008-02-28 21:17 UTC (permalink / raw)
  To: ocfs2-devel

Mark Fasheh wrote:
> On Fri, Feb 29, 2008 at 09:10:12AM +0800, tao.ma wrote:
>>>> @@ -542,6 +577,17 @@ int ocfs2_reserve_new_inode(struct ocfs2_super *osb,
>>>>  	status = ocfs2_reserve_suballoc_bits(osb, *ac,
>>>>  					     INODE_ALLOC_SYSTEM_INODE,
>>>>  					     osb->slot_num, ALLOC_NEW_GROUP);
>>>> +	if (status >= 0) {
>>>> +		status = 0;
>>>> +		goto bail;
>>>> +	} else if (status < 0 && status != -ENOSPC) {
>>>> +		mlog_errno(status);
>>>> +		goto bail;
>>>> +	}
>>>> +
>>>> +	ocfs2_free_ac_resource(*ac);
>>>> +
>>>> +	status = ocfs2_steal_inode_from_other_nodes(osb, *ac);
>>> Does this mean we always search our own first, even if we know it's not
>>> likely to have anything in it? Wouldn't it be better to ignore once it's
>>> full until we get to one of the spots where you've inserted a call to
>>> ocfs2_init_inode_steal_slot)?
>> I am worried about the situation that the global bitmap is flushed by other 
>> nodes and how the current node notice this and use its own local allocator 
>> again. Or should it notice this?
>> Another thing is that what if the inode which is owned by this slot deleted 
>> by other slots? We should have the ability to allocate the inode from our 
>> own slot now.
> 
> Both your points come down to "how do we know when another node has changed
> the allocators such that we can now allocate inodes from our local
> inode_alloc file."
> 
> Unfortunately, we have no way of knowing what another node does. I'm mostly 
> worried about the inefficency of continuously searching ours when there
> isn't likely any room left in it.
> 
> One thing we could do is only go back to our local allocator after some time
> has passed (or some number of allocs). That way we don't check it every
> single time, but we never completely leave it out of the picture either.
OK, I will modify it. Thanks.
