From: Mark Fasheh <mark.fasheh@oracle.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org, ocfs2-devel@oss.oracle.com,
tao.ma@oracle.com
Subject: [Ocfs2-devel] Re: [PATCH 13/30] ocfs2: Implement group add for online resize
Date: Wed Jan 23 18:48:51 2008 [thread overview]
Message-ID: <20080124024844.GP23506@ca-server1.us.oracle.com> (raw)
In-Reply-To: <20080123140554.bf3981e5.akpm@linux-foundation.org>
On Wed, Jan 23, 2008 at 02:05:54PM -0800, Andrew Morton wrote:
> > On Thu, 17 Jan 2008 14:35:39 -0800 Mark Fasheh <mark.fasheh@oracle.com> wrote:
> > From: Tao Ma <tao.ma@oracle.com>
> >
> > This patch adds the ability for a userspace program to request that a
> > properly formatted cluster group be added to the main allocation bitmap for
> > an Ocfs2 file system. The request is made via an ioctl, OCFS2_IOC_GROUP_ADD.
> > On a high level, this is similar to ext3, but we use a different ioctl as
> > the structure which has to be passed through is different.
> >
> > During an online resize, tunefs.ocfs2 will format any new cluster groups
> > which must be added to complete the resize, and call OCFS2_IOC_GROUP_ADD on
> > each one. Kernel verifies that the core cluster group information is valid
> > and then does the work of linking it into the global allocation bitmap.
> >
> > ...
> >
> > +/* Used to pass group descriptor data when online resize is done */
> > +struct ocfs2_new_group_input {
> > + __u64 group; /* Group descriptor's blkno. */
> > + __u32 clusters; /* Total number of clusters in this group */
> > + __u32 frees; /* Total free clusters in this group */
> > + __u16 chain; /* Chain for this group */
> > + __u16 reserved1;
> > + __u32 reserved2;
> > +};
>
> Are we sure that all architectures will lay this out in the same way with
> both 32-bit and 64-bit userspace?
I looked it over several times and haven't been able to find a problem -
everything is aligned at 8 byte boundaries. Do you see anything that might
be problematic?
> > +/* Add a new group descriptor to global_bitmap. */
> > +int ocfs2_group_add(struct inode *inode, struct ocfs2_new_group_input *input)
> > +{
> > + int ret;
> > + handle_t *handle;
> > + struct buffer_head *main_bm_bh = NULL;
> > + struct inode *main_bm_inode = NULL;
> > + struct ocfs2_dinode *fe = NULL;
> > + struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
> > + struct buffer_head *group_bh = NULL;
> > + struct ocfs2_group_desc *group = NULL;
> > + struct ocfs2_chain_list *cl;
> > + struct ocfs2_chain_rec *cr;
> > + u16 cl_bpc;
> > +
> > + mlog_entry_void();
> > +
> > + if (ocfs2_is_hard_readonly(osb) || ocfs2_is_soft_readonly(osb))
> > + return -EROFS;
> > +
> > + main_bm_inode = ocfs2_get_system_file_inode(osb,
> > + GLOBAL_BITMAP_SYSTEM_INODE,
> > + OCFS2_INVALID_SLOT);
> > + if (!main_bm_inode) {
> > + ret = -EINVAL;
> > + mlog_errno(ret);
> > + goto out;
> > + }
> > +
> > + mutex_lock(&main_bm_inode->i_mutex);
> > +
> > + ret = ocfs2_inode_lock(main_bm_inode, &main_bm_bh, 1);
> > + if (ret < 0) {
> > + mlog_errno(ret);
> > + goto out_mutex;
> > + }
> > +
> > + fe = (struct ocfs2_dinode *)main_bm_bh->b_data;
> > +
> > + if (le16_to_cpu(fe->id2.i_chain.cl_cpg) !=
> > + ocfs2_group_bitmap_size(osb->sb) * 8) {
> > + mlog(ML_ERROR, "The disk is too old and small."
> > + " Force to do offline resize.");
> > + ret = -EINVAL;
> > + goto out_unlock;
> > + }
> > +
> > + ret = ocfs2_read_block(osb, input->group, &group_bh, 0, NULL);
> > + if (ret < 0) {
> > + mlog(ML_ERROR, "Can't read the group descriptor # %llu "
> > + "from the device.", input->group);
> > + goto out;
>
> Bug: goto wrong_place. (Points at fault-injection code)
Ooof, good catch - thanks. A fix for this is attached to this e-mail, and
of course will be in ocfs2.git.
> > + }
> > +
> > + ocfs2_set_new_buffer_uptodate(inode, group_bh);
> > +
> > + ret = ocfs2_verify_group_and_input(main_bm_inode, fe, input, group_bh);
> > + if (ret) {
> > + mlog_errno(ret);
> > + goto out_unlock;
> > + }
> > +
> > + mlog(0, "Add a new group %llu in chain = %u, length = %u\n",
> > + input->group, input->chain, input->clusters);
> > +
> > + handle = ocfs2_start_trans(osb, OCFS2_GROUP_ADD_CREDITS);
> > + if (IS_ERR(handle)) {
> > + mlog_errno(PTR_ERR(handle));
> > + ret = -EINVAL;
> > + goto out_unlock;
> > + }
> > +
> > + cl_bpc = le16_to_cpu(fe->id2.i_chain.cl_bpc);
> > + cl = &fe->id2.i_chain;
> > + cr = &cl->cl_recs[input->chain];
> > +
> > + ret = ocfs2_journal_access(handle, main_bm_inode, group_bh,
> > + OCFS2_JOURNAL_ACCESS_WRITE);
> > + if (ret < 0) {
> > + mlog_errno(ret);
> > + goto out_commit;
> > + }
> > +
> > + group = (struct ocfs2_group_desc *)group_bh->b_data;
> > + group->bg_next_group = cr->c_blkno;
> > +
> > + ret = ocfs2_journal_dirty(handle, group_bh);
> > + if (ret < 0) {
> > + mlog_errno(ret);
> > + goto out_commit;
> > + }
> > +
> > + ret = ocfs2_journal_access(handle, main_bm_inode, main_bm_bh,
> > + OCFS2_JOURNAL_ACCESS_WRITE);
>
> hm. Do we need to do that again?
I think you're missing that those are two different buffers :)
> > + spin_lock(&OCFS2_I(main_bm_inode)->ip_lock);
> > + OCFS2_I(main_bm_inode)->ip_clusters = le32_to_cpu(fe->i_clusters);
> > + le64_add_cpu(&fe->i_size, input->clusters << osb->s_clustersize_bits);
> > + spin_unlock(&OCFS2_I(main_bm_inode)->ip_lock);
> > + i_size_write(main_bm_inode, le64_to_cpu(fe->i_size));
>
> Is i_mutex held?
Yes.
--Mark
--
Mark Fasheh
Principal Software Developer, Oracle
mark.fasheh@oracle.com
From: Mark Fasheh <mark.fasheh@oracle.com>
ocfs2: fix goto error in ocfs2_group_add()
We were jumping to the wrong label on read error which would have caused us
to exit the function with locks held.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
---
fs/ocfs2/resize.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/fs/ocfs2/resize.c b/fs/ocfs2/resize.c
index a103698..261dabf 100644
--- a/fs/ocfs2/resize.c
+++ b/fs/ocfs2/resize.c
@@ -544,7 +544,7 @@ int ocfs2_group_add(struct inode *inode, struct ocfs2_new_group_input *input)
if (ret < 0) {
mlog(ML_ERROR, "Can't read the group descriptor # %llu "
"from the device.", (unsigned long long)input->group);
- goto out;
+ goto out_unlock;
}
ocfs2_set_new_buffer_uptodate(inode, group_bh);
--
1.5.3.6
WARNING: multiple messages have this Message-ID (diff)
From: Mark Fasheh <mark.fasheh@oracle.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org, ocfs2-devel@oss.oracle.com,
tao.ma@oracle.com
Subject: Re: [PATCH 13/30] ocfs2: Implement group add for online resize
Date: Wed, 23 Jan 2008 18:48:44 -0800 [thread overview]
Message-ID: <20080124024844.GP23506@ca-server1.us.oracle.com> (raw)
In-Reply-To: <20080123140554.bf3981e5.akpm@linux-foundation.org>
On Wed, Jan 23, 2008 at 02:05:54PM -0800, Andrew Morton wrote:
> > On Thu, 17 Jan 2008 14:35:39 -0800 Mark Fasheh <mark.fasheh@oracle.com> wrote:
> > From: Tao Ma <tao.ma@oracle.com>
> >
> > This patch adds the ability for a userspace program to request that a
> > properly formatted cluster group be added to the main allocation bitmap for
> > an Ocfs2 file system. The request is made via an ioctl, OCFS2_IOC_GROUP_ADD.
> > On a high level, this is similar to ext3, but we use a different ioctl as
> > the structure which has to be passed through is different.
> >
> > During an online resize, tunefs.ocfs2 will format any new cluster groups
> > which must be added to complete the resize, and call OCFS2_IOC_GROUP_ADD on
> > each one. Kernel verifies that the core cluster group information is valid
> > and then does the work of linking it into the global allocation bitmap.
> >
> > ...
> >
> > +/* Used to pass group descriptor data when online resize is done */
> > +struct ocfs2_new_group_input {
> > + __u64 group; /* Group descriptor's blkno. */
> > + __u32 clusters; /* Total number of clusters in this group */
> > + __u32 frees; /* Total free clusters in this group */
> > + __u16 chain; /* Chain for this group */
> > + __u16 reserved1;
> > + __u32 reserved2;
> > +};
>
> Are we sure that all architectures will lay this out in the same way with
> both 32-bit and 64-bit userspace?
I looked it over several times and haven't been able to find a problem -
everything is aligned at 8 byte boundaries. Do you see anything that might
be problematic?
> > +/* Add a new group descriptor to global_bitmap. */
> > +int ocfs2_group_add(struct inode *inode, struct ocfs2_new_group_input *input)
> > +{
> > + int ret;
> > + handle_t *handle;
> > + struct buffer_head *main_bm_bh = NULL;
> > + struct inode *main_bm_inode = NULL;
> > + struct ocfs2_dinode *fe = NULL;
> > + struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
> > + struct buffer_head *group_bh = NULL;
> > + struct ocfs2_group_desc *group = NULL;
> > + struct ocfs2_chain_list *cl;
> > + struct ocfs2_chain_rec *cr;
> > + u16 cl_bpc;
> > +
> > + mlog_entry_void();
> > +
> > + if (ocfs2_is_hard_readonly(osb) || ocfs2_is_soft_readonly(osb))
> > + return -EROFS;
> > +
> > + main_bm_inode = ocfs2_get_system_file_inode(osb,
> > + GLOBAL_BITMAP_SYSTEM_INODE,
> > + OCFS2_INVALID_SLOT);
> > + if (!main_bm_inode) {
> > + ret = -EINVAL;
> > + mlog_errno(ret);
> > + goto out;
> > + }
> > +
> > + mutex_lock(&main_bm_inode->i_mutex);
> > +
> > + ret = ocfs2_inode_lock(main_bm_inode, &main_bm_bh, 1);
> > + if (ret < 0) {
> > + mlog_errno(ret);
> > + goto out_mutex;
> > + }
> > +
> > + fe = (struct ocfs2_dinode *)main_bm_bh->b_data;
> > +
> > + if (le16_to_cpu(fe->id2.i_chain.cl_cpg) !=
> > + ocfs2_group_bitmap_size(osb->sb) * 8) {
> > + mlog(ML_ERROR, "The disk is too old and small."
> > + " Force to do offline resize.");
> > + ret = -EINVAL;
> > + goto out_unlock;
> > + }
> > +
> > + ret = ocfs2_read_block(osb, input->group, &group_bh, 0, NULL);
> > + if (ret < 0) {
> > + mlog(ML_ERROR, "Can't read the group descriptor # %llu "
> > + "from the device.", input->group);
> > + goto out;
>
> Bug: goto wrong_place. (Points at fault-injection code)
Ooof, good catch - thanks. A fix for this is attached to this e-mail, and
of course will be in ocfs2.git.
> > + }
> > +
> > + ocfs2_set_new_buffer_uptodate(inode, group_bh);
> > +
> > + ret = ocfs2_verify_group_and_input(main_bm_inode, fe, input, group_bh);
> > + if (ret) {
> > + mlog_errno(ret);
> > + goto out_unlock;
> > + }
> > +
> > + mlog(0, "Add a new group %llu in chain = %u, length = %u\n",
> > + input->group, input->chain, input->clusters);
> > +
> > + handle = ocfs2_start_trans(osb, OCFS2_GROUP_ADD_CREDITS);
> > + if (IS_ERR(handle)) {
> > + mlog_errno(PTR_ERR(handle));
> > + ret = -EINVAL;
> > + goto out_unlock;
> > + }
> > +
> > + cl_bpc = le16_to_cpu(fe->id2.i_chain.cl_bpc);
> > + cl = &fe->id2.i_chain;
> > + cr = &cl->cl_recs[input->chain];
> > +
> > + ret = ocfs2_journal_access(handle, main_bm_inode, group_bh,
> > + OCFS2_JOURNAL_ACCESS_WRITE);
> > + if (ret < 0) {
> > + mlog_errno(ret);
> > + goto out_commit;
> > + }
> > +
> > + group = (struct ocfs2_group_desc *)group_bh->b_data;
> > + group->bg_next_group = cr->c_blkno;
> > +
> > + ret = ocfs2_journal_dirty(handle, group_bh);
> > + if (ret < 0) {
> > + mlog_errno(ret);
> > + goto out_commit;
> > + }
> > +
> > + ret = ocfs2_journal_access(handle, main_bm_inode, main_bm_bh,
> > + OCFS2_JOURNAL_ACCESS_WRITE);
>
> hm. Do we need to do that again?
I think you're missing that those are two different buffers :)
> > + spin_lock(&OCFS2_I(main_bm_inode)->ip_lock);
> > + OCFS2_I(main_bm_inode)->ip_clusters = le32_to_cpu(fe->i_clusters);
> > + le64_add_cpu(&fe->i_size, input->clusters << osb->s_clustersize_bits);
> > + spin_unlock(&OCFS2_I(main_bm_inode)->ip_lock);
> > + i_size_write(main_bm_inode, le64_to_cpu(fe->i_size));
>
> Is i_mutex held?
Yes.
--Mark
--
Mark Fasheh
Principal Software Developer, Oracle
mark.fasheh@oracle.com
From: Mark Fasheh <mark.fasheh@oracle.com>
ocfs2: fix goto error in ocfs2_group_add()
We were jumping to the wrong label on read error which would have caused us
to exit the function with locks held.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
---
fs/ocfs2/resize.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/fs/ocfs2/resize.c b/fs/ocfs2/resize.c
index a103698..261dabf 100644
--- a/fs/ocfs2/resize.c
+++ b/fs/ocfs2/resize.c
@@ -544,7 +544,7 @@ int ocfs2_group_add(struct inode *inode, struct ocfs2_new_group_input *input)
if (ret < 0) {
mlog(ML_ERROR, "Can't read the group descriptor # %llu "
"from the device.", (unsigned long long)input->group);
- goto out;
+ goto out_unlock;
}
ocfs2_set_new_buffer_uptodate(inode, group_bh);
--
1.5.3.6
next prev parent reply other threads:[~2008-01-23 18:48 UTC|newest]
Thread overview: 78+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-01-17 14:37 [Ocfs2-devel] [PATCH 0/30] Ocfs2 and Configfs patches for 2.6.25-rc1 Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-17 14:36 ` [Ocfs2-devel] [PATCH 20/30] ocfs2: Silence false lockdep warnings Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-17 14:36 ` [Ocfs2-devel] [PATCH 30/30] configfs: file.c fix possible recursive locking Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-17 14:36 ` [Ocfs2-devel] [PATCH 16/30] ocfs2: Support commit= mount option Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-17 14:36 ` [Ocfs2-devel] [PATCH 24/30] ocfs2: Update default cluster timeouts Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-17 14:36 ` [Ocfs2-devel] [PATCH 27/30] ocfs2: bump version number Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-17 14:36 ` [Ocfs2-devel] [PATCH 09/30] ocfs2: Documentation update Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-17 14:36 ` [Ocfs2-devel] [PATCH 15/30] ocfs2: build warnings fix Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-17 14:36 ` [Ocfs2-devel] [PATCH 07/30] dlm: Split lock mode and flag constants into a sharable header Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-17 14:36 ` [Ocfs2-devel] [PATCH 05/30] ocfs2: Remove data locks Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-17 14:36 ` [Ocfs2-devel] [PATCH 21/30] ocfs2: Safer read_inline_data() Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-17 14:36 ` [Ocfs2-devel] [PATCH 04/30] ocfs2: Add data downconvert worker to inode lock Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-17 14:36 ` [Ocfs2-devel] [PATCH 10/30] ocfs2: Initalize bitmap_cpg of ocfs2_super to be the maximum Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-17 14:36 ` [Ocfs2-devel] [PATCH 03/30] ocfs2: Remove mount/unmount votes Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-23 22:05 ` Andrew Morton
2008-01-23 22:05 ` [Ocfs2-devel] " Andrew Morton
2008-01-23 17:02 ` [Ocfs2-devel] " Mark Fasheh
2008-01-24 1:02 ` Mark Fasheh
2008-01-17 14:36 ` [Ocfs2-devel] [PATCH 18/30] ocfs2: add flock lock type Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-17 14:36 ` [Ocfs2-devel] [PATCH 12/30] ocfs2: Add group extend for online resize Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-23 22:05 ` Andrew Morton
2008-01-23 22:06 ` [Ocfs2-devel] " Andrew Morton
2008-01-23 19:15 ` [Ocfs2-devel] " Mark Fasheh
2008-01-24 3:14 ` Mark Fasheh
2008-01-17 14:36 ` [Ocfs2-devel] [PATCH 25/30] ocfs2: convert byte order of constant instead of variable Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-17 14:37 ` [Ocfs2-devel] [PATCH 08/30] ocfs2: Readpages support Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-23 22:05 ` Andrew Morton
2008-01-23 22:05 ` [Ocfs2-devel] " Andrew Morton
2008-01-23 17:21 ` [Ocfs2-devel] " Mark Fasheh
2008-01-24 1:20 ` Mark Fasheh
2008-01-17 14:37 ` [Ocfs2-devel] [PATCH 02/30] ocfs2: Remove fs dependency on ocfs2_heartbeat module Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-17 14:37 ` [Ocfs2-devel] [PATCH 14/30] ocfs2: Add missing permission checks Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-17 14:37 ` [Ocfs2-devel] [PATCH 01/30] ocfs2_dlm: Call node eviction callbacks from heartbeat handler Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-17 14:37 ` [Ocfs2-devel] [PATCH 23/30] ocfs2: printf fixes Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-17 14:37 ` [Ocfs2-devel] [PATCH 29/30] configfs: dir.c fix possible recursive locking Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-17 14:37 ` [Ocfs2-devel] [PATCH 06/30] ocfs2: Rename ocfs2_meta_[un]lock Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-17 14:37 ` [Ocfs2-devel] [PATCH 19/30] ocfs2: cluster aware flock() Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-17 14:37 ` [Ocfs2-devel] [PATCH 11/30] ocfs2: Reserve ioctl range Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-17 14:37 ` [Ocfs2-devel] [PATCH 28/30] configfs: Remove EXPERIMENTAL Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-17 14:37 ` [Ocfs2-devel] [PATCH 26/30] ocfs2/dlm: Clear joining_node on hearbeat node down Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-17 14:37 ` [Ocfs2-devel] [PATCH 17/30] ocfs2: Local alloc window size changeable via mount option Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-17 14:37 ` [Ocfs2-devel] [PATCH 22/30] ocfs2: Use generic_file_llseek Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-17 14:37 ` [Ocfs2-devel] [PATCH 13/30] ocfs2: Implement group add for online resize Mark Fasheh
2008-01-17 22:35 ` Mark Fasheh
2008-01-23 22:05 ` Andrew Morton
2008-01-23 22:08 ` [Ocfs2-devel] " Andrew Morton
2008-01-23 18:48 ` Mark Fasheh [this message]
2008-01-24 2:48 ` Mark Fasheh
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080124024844.GP23506@ca-server1.us.oracle.com \
--to=mark.fasheh@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=ocfs2-devel@oss.oracle.com \
--cc=tao.ma@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.