* [PATCH] Ext3 online resizing locking issue
@ 2005-08-24 21:03 Glauber de Oliveira Costa
2005-08-25 19:02 ` Stephen C. Tweedie
0 siblings, 1 reply; 7+ messages in thread
From: Glauber de Oliveira Costa @ 2005-08-24 21:03 UTC (permalink / raw)
To: ext2-devel, ext2resize-devel, linux-kernel, linux-fsdevel,
adilger, sct, akpm
This simple patch provides a fix for a locking issue found in the online
resizing code. The problem actually happened while trying to resize the
filesystem trough the resize=xxx option in a remount.
Signed-off-by: Glauber de Oliveira Costa <gocosta@br.ibm.com>
diff -up linux-2.6.13-rc6-orig/fs/ext3/ioctl.c linux/fs/ext3/ioctl.c
--- linux-2.6.13-rc6-orig/fs/ext3/ioctl.c 2005-08-24 17:48:22.000000000 -0300
+++ linux/fs/ext3/ioctl.c 2005-08-24 15:12:48.000000000 -0300
@@ -206,7 +206,9 @@ flags_err:
if (get_user(n_blocks_count, (__u32 __user *)arg))
return -EFAULT;
+ lock_super(sb);
err = ext3_group_extend(sb, EXT3_SB(sb)->s_es, n_blocks_count);
+ unlock_super(sb);
journal_lock_updates(EXT3_SB(sb)->s_journal);
journal_flush(EXT3_SB(sb)->s_journal);
journal_unlock_updates(EXT3_SB(sb)->s_journal);
Only in linux/fs/ext3: patch-mnt_resize
diff -up linux-2.6.13-rc6-orig/fs/ext3/resize.c linux/fs/ext3/resize.c
--- linux-2.6.13-rc6-orig/fs/ext3/resize.c 2005-08-24 17:48:22.000000000 -0300
+++ linux/fs/ext3/resize.c 2005-08-24 15:15:28.000000000 -0300
@@ -884,7 +884,9 @@ exit_put:
/* Extend the filesystem to the new number of blocks specified. This entry
* point is only used to extend the current filesystem to the end of the last
* existing group. It can be accessed via ioctl, or by "remount,resize=<size>"
- * for emergencies (because it has no dependencies on reserved blocks).
+ * for emergencies (because it has no dependencies on reserved blocks).
+ *
+ * It should be called with sb->s_lock held
*
* If we _really_ wanted, we could use default values to call ext3_group_add()
* allow the "remount" trick to work for arbitrary resizing, assuming enough
@@ -959,7 +961,6 @@ int ext3_group_extend(struct super_block
goto exit_put;
}
- lock_super(sb);
if (o_blocks_count != le32_to_cpu(es->s_blocks_count)) {
ext3_warning(sb, __FUNCTION__,
"multiple resizers run on filesystem!\n");
@@ -978,7 +979,6 @@ int ext3_group_extend(struct super_block
es->s_blocks_count = cpu_to_le32(o_blocks_count + add);
ext3_journal_dirty_metadata(handle, EXT3_SB(sb)->s_sbh);
sb->s_dirt = 1;
- unlock_super(sb);
ext3_debug("freeing blocks %ld through %ld\n", o_blocks_count,
o_blocks_count + add);
ext3_free_blocks_sb(handle, sb, o_blocks_count, add, &freed_blocks);
diff -up linux-2.6.13-rc6-orig/fs/ext3/super.c linux/fs/ext3/super.c
--- linux-2.6.13-rc6-orig/fs/ext3/super.c 2005-08-24 17:48:22.000000000 -0300
+++ linux/fs/ext3/super.c 2005-08-24 15:13:16.000000000 -0300
@@ -639,8 +639,8 @@ static match_table_t tokens = {
{Opt_quota, "quota"},
{Opt_quota, "usrquota"},
{Opt_barrier, "barrier=%u"},
+ {Opt_resize, "resize=%u"},
{Opt_err, NULL},
- {Opt_resize, "resize"},
};
static unsigned long get_sb_block(void **data)
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Ext3 online resizing locking issue
2005-08-24 21:03 [PATCH] Ext3 online resizing locking issue Glauber de Oliveira Costa
@ 2005-08-25 19:02 ` Stephen C. Tweedie
2005-08-25 20:43 ` Glauber de Oliveira Costa
0 siblings, 1 reply; 7+ messages in thread
From: Stephen C. Tweedie @ 2005-08-25 19:02 UTC (permalink / raw)
To: Glauber de Oliveira Costa
Cc: ext2-devel@lists.sourceforge.net, ext2resize-devel, linux-kernel,
linux-fsdevel, Andreas Dilger, Andrew Morton, Stephen Tweedie,
Al Viro
Hi,
On Wed, 2005-08-24 at 22:03, Glauber de Oliveira Costa wrote:
> This simple patch provides a fix for a locking issue found in the online
> resizing code. The problem actually happened while trying to resize the
> filesystem trough the resize=xxx option in a remount.
NAK, this is wrong:
> + lock_super(sb);
> err = ext3_group_extend(sb, EXT3_SB(sb)->s_es, n_blocks_count);
> + unlock_super(sb);
This basically reverses the order of locking between lock_super() and
journal_start() (the latter acts like a lock because it can block on a
resource if the journal is too full for the new transaction.) That's
the opposite order to normal, and will result in a potential deadlock.
> + {Opt_resize, "resize=%u"},
> {Opt_err, NULL},
> - {Opt_resize, "resize"},
Right, that's disabled for now. I guess the easy fix here is just to
remove the code entirely, given that we have locking problems with
trying to fix it!
But the _right_ fix, if you really want to keep that code, is probably
to move all the resize locking to a separate lock that ranks outside the
journal_start. The easy workaround is to drop the superblock lock and
reaquire it around the journal_start(); it would be pretty easy to make
that work robustly as far as ext3 is concerned, but I suspect there may
be VFS-layer problems if we start dropping the superblock lock in the
middle of the s_ops->remount() call --- Al?
--Stephen
-------------------------------------------------------
SF.Net email is Sponsored by the Better Software Conference & EXPO
September 19-22, 2005 * San Francisco, CA * Development Lifecycle Practices
Agile & Plan-Driven Development * Managing Projects & Teams * Testing & QA
Security * Process Improvement & Measurement * http://www.sqe.com/bsce5sf
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Ext3 online resizing locking issue
2005-08-25 19:02 ` Stephen C. Tweedie
@ 2005-08-25 20:43 ` Glauber de Oliveira Costa
2005-08-30 14:06 ` Stephen C. Tweedie
0 siblings, 1 reply; 7+ messages in thread
From: Glauber de Oliveira Costa @ 2005-08-25 20:43 UTC (permalink / raw)
To: Stephen C. Tweedie
Cc: Glauber de Oliveira Costa, ext2-devel@lists.sourceforge.net,
ext2resize-devel, linux-kernel, linux-fsdevel, Andreas Dilger,
Andrew Morton, Al Viro
> NAK, this is wrong:
>
> > + lock_super(sb);
> > err = ext3_group_extend(sb, EXT3_SB(sb)->s_es, n_blocks_count);
> > + unlock_super(sb);
>
> This basically reverses the order of locking between lock_super() and
> journal_start() (the latter acts like a lock because it can block on a
> resource if the journal is too full for the new transaction.) That's
> the opposite order to normal, and will result in a potential deadlock.
>
Ooops! Missed that. But I agree with the point.
> But the _right_ fix, if you really want to keep that code, is probably
> to move all the resize locking to a separate lock that ranks outside the
> journal_start. The easy workaround is to drop the superblock lock and
> reaquire it around the journal_start(); it would be pretty easy to make
> that work robustly as far as ext3 is concerned, but I suspect there may
> be VFS-layer problems if we start dropping the superblock lock in the
> middle of the s_ops->remount() call --- Al?
>
Just a question here. With s_lock held by the remount code, we're
altering the struct super_block, and believing we're safe. We try to
acquire it inside the resize functions, because we're trying to modify
this same data. Thus, if we rely on another lock, aren't we probably
messing up something ? (for example, both group_extend and remount code
potentially modify s_flags field. If we ioctl and remount at the same time,
each one with a different lock, something could go wrong). Am I missing
something here ?
glauber
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Ext3 online resizing locking issue
2005-08-25 20:43 ` Glauber de Oliveira Costa
@ 2005-08-30 14:06 ` Stephen C. Tweedie
2005-08-31 11:35 ` Glauber de Oliveira Costa
0 siblings, 1 reply; 7+ messages in thread
From: Stephen C. Tweedie @ 2005-08-30 14:06 UTC (permalink / raw)
To: Glauber de Oliveira Costa
Cc: ext2-devel@lists.sourceforge.net, ext2resize-devel, linux-kernel,
linux-fsdevel, Andreas Dilger, Andrew Morton, Al Viro,
Stephen Tweedie
Hi,
On Thu, 2005-08-25 at 21:43, Glauber de Oliveira Costa wrote:
> Just a question here. With s_lock held by the remount code, we're
> altering the struct super_block, and believing we're safe. We try to
> acquire it inside the resize functions, because we're trying to modify
> this same data. Thus, if we rely on another lock, aren't we probably
> messing up something ?
The two different uses of the superblock lock are really quite
different; I don't see any particular problem with using two different
locks for the two different things. Mount and the namespace code are
not locking the same thing --- the fact that the resize code uses the
superblock lock is really a historical side-effect of the fact that we
used to use the same overloaded superblock lock in the ext2/ext3 block
allocation layers to guard bitmap access.
--Stephen
-------------------------------------------------------
SF.Net email is Sponsored by the Better Software Conference & EXPO
September 19-22, 2005 * San Francisco, CA * Development Lifecycle Practices
Agile & Plan-Driven Development * Managing Projects & Teams * Testing & QA
Security * Process Improvement & Measurement * http://www.sqe.com/bsce5sf
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Ext3 online resizing locking issue
2005-08-30 14:06 ` Stephen C. Tweedie
@ 2005-08-31 11:35 ` Glauber de Oliveira Costa
2005-08-31 13:30 ` Stephen C. Tweedie
0 siblings, 1 reply; 7+ messages in thread
From: Glauber de Oliveira Costa @ 2005-08-31 11:35 UTC (permalink / raw)
To: Stephen C. Tweedie
Cc: Glauber de Oliveira Costa, ext2-devel@lists.sourceforge.net,
ext2resize-devel, linux-kernel, linux-fsdevel, Andreas Dilger,
Andrew Morton, Al Viro
>
> The two different uses of the superblock lock are really quite
> different; I don't see any particular problem with using two different
> locks for the two different things. Mount and the namespace code are
> not locking the same thing --- the fact that the resize code uses the
> superblock lock is really a historical side-effect of the fact that we
> used to use the same overloaded superblock lock in the ext2/ext3 block
> allocation layers to guard bitmap access.
>
>
At a first look, i thought about locking gdt-related data. But in a
closer one, it seemed to me that we're in fact modifying a little bit
more than that in the resize code. But all these modifications seem to
be somehow related to the ext3 super block specific data in
ext3_sb_info. My first naive approach would be adding a lock to that
struct
Besides that, by doing that, we become pretty much independent of vfs
locking decisions to handle ext3 data. Do you think it all make sense?
--
=====================================
Glauber de Oliveira Costa
IBM Linux Technology Center - Brazil
gocosta@br.ibm.com
=====================================
-------------------------------------------------------
SF.Net email is Sponsored by the Better Software Conference & EXPO
September 19-22, 2005 * San Francisco, CA * Development Lifecycle Practices
Agile & Plan-Driven Development * Managing Projects & Teams * Testing & QA
Security * Process Improvement & Measurement * http://www.sqe.com/bsce5sf
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Ext3 online resizing locking issue
2005-08-31 11:35 ` Glauber de Oliveira Costa
@ 2005-08-31 13:30 ` Stephen C. Tweedie
2005-09-01 23:04 ` [PATCH][RFC] Ext3 online resizing locking issue (Again) Glauber de Oliveira Costa
0 siblings, 1 reply; 7+ messages in thread
From: Stephen C. Tweedie @ 2005-08-31 13:30 UTC (permalink / raw)
To: Glauber de Oliveira Costa
Cc: ext2-devel@lists.sourceforge.net, ext2resize-devel, linux-kernel,
linux-fsdevel, Andreas Dilger, Andrew Morton, Al Viro,
Stephen Tweedie
Hi,
On Wed, 2005-08-31 at 12:35, Glauber de Oliveira Costa wrote:
> At a first look, i thought about locking gdt-related data. But in a
> closer one, it seemed to me that we're in fact modifying a little bit
> more than that in the resize code. But all these modifications seem to
> be somehow related to the ext3 super block specific data in
> ext3_sb_info. My first naive approach would be adding a lock to that
> struct
I took great care when making that code SMP-safe to avoid such locks,
for performance reasons. See the comments at
* We need to protect s_groups_count against other CPUs seeing
* inconsistent state in the superblock.
in fs/ext3/resize.c for the rules. But basically the way it works is
that we only usually modify data that cannot be in use by other parts of
the kernel --- and that's fairly easy to guarantee, since by definition
extending the fs is something that is touching bits that aren't already
in use. Only once all the new data is safely installed do we atomically
update the s_groups_count field, which instantly makes the new data
visible. We enforce this ordering via smp read barriers before reading
s_groups_count and write barriers after modifying it, but we don't
actually have locks as such.
The only use of locking in the resize is hence the superblock lock,
which is not really there to protect the resize from the rest of the fs
--- the s_groups_count barriers do that. All the sb lock is needed for
is to prevent two resizes from progressing at the same time; and that
could easily be abstracted into a separate resize lock.
Cheers,
Stephen
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH][RFC] Ext3 online resizing locking issue (Again)
2005-08-31 13:30 ` Stephen C. Tweedie
@ 2005-09-01 23:04 ` Glauber de Oliveira Costa
0 siblings, 0 replies; 7+ messages in thread
From: Glauber de Oliveira Costa @ 2005-09-01 23:04 UTC (permalink / raw)
To: Stephen C. Tweedie
Cc: Glauber de Oliveira Costa, ext2-devel@lists.sourceforge.net,
ext2resize-devel, linux-kernel, linux-fsdevel, Andreas Dilger,
Andrew Morton, Al Viro
[-- Attachment #1: Type: text/plain, Size: 966 bytes --]
Hi.
Here is my new trial for the resize lock issue.
Basically, it goes as follows:
To ensure that only one resizer is running at a time, I added a global
lock that is acquired in the very beginning of ext3_group_add and
ext3_group_extend.
lock_super is now only used in ext3_group_add in the moment we alter
s_groups_count, and released after the super block is marked dirty.
In ext3_group_extend, this is done outside the main function, so we can
do it trusting the lock to be already held while in remount, or
acquiring it explicitly while in ioctl.
The lock in ext3_setup_new_group_blocks was simply wiped out, since this
is always called from one of the functions that already holds the lock
(and thus, in a safe environment)
Signed-off-by: Glauber de Oliveira Costa <gocosta@br.ibm.com>
--
=====================================
Glauber de Oliveira Costa
IBM Linux Technology Center - Brazil
gocosta@br.ibm.com
=====================================
[-- Attachment #2: patch-resize_lock --]
[-- Type: text/plain, Size: 6439 bytes --]
Only in linux/fs/ext3/: .tmp_versions
Only in linux/fs/ext3/: fileidx
diff -bup linux-2.6.13-orig/fs/ext3/ioctl.c linux/fs/ext3/ioctl.c
--- linux-2.6.13-orig/fs/ext3/ioctl.c 2005-09-01 12:44:31.000000000 -0300
+++ linux/fs/ext3/ioctl.c 2005-09-01 11:42:22.000000000 -0300
@@ -207,6 +207,12 @@ flags_err:
return -EFAULT;
err = ext3_group_extend(sb, EXT3_SB(sb)->s_es, n_blocks_count);
+ if (!err){
+ lock_super(sb);
+ sb->s_dirt = 1;
+ unlock_super(sb);
+ }
+
journal_lock_updates(EXT3_SB(sb)->s_journal);
journal_flush(EXT3_SB(sb)->s_journal);
journal_unlock_updates(EXT3_SB(sb)->s_journal);
Only in linux/fs/ext3/: patch
diff -bup linux-2.6.13-orig/fs/ext3/resize.c linux/fs/ext3/resize.c
--- linux-2.6.13-orig/fs/ext3/resize.c 2005-09-01 12:44:31.000000000 -0300
+++ linux/fs/ext3/resize.c 2005-09-01 19:30:40.000000000 -0300
@@ -23,6 +23,8 @@
#define outside(b, first, last) ((b) < (first) || (b) >= (last))
#define inside(b, first, last) ((b) >= (first) && (b) < (last))
+DECLARE_MUTEX(resize_lock);
+
static int verify_group_input(struct super_block *sb,
struct ext3_new_group_data *input)
{
@@ -178,7 +180,6 @@ static int setup_new_group_blocks(struct
if (IS_ERR(handle))
return PTR_ERR(handle);
- lock_super(sb);
if (input->group != sbi->s_groups_count) {
err = -EBUSY;
goto exit_journal;
@@ -194,6 +195,7 @@ static int setup_new_group_blocks(struct
ext3_set_bit(0, bh->b_data);
}
+ smp_rmb();
/* Copy all of the GDT blocks into the backup in this group */
for (i = 0, bit = 1, block = start + 1;
i < gdblocks; i++, block++, bit++) {
@@ -271,7 +273,6 @@ exit_bh:
brelse(bh);
exit_journal:
- unlock_super(sb);
if ((err2 = ext3_journal_stop(handle)) && !err)
err = err2;
@@ -706,6 +707,11 @@ int ext3_group_add(struct super_block *s
int gdb_off, gdb_num;
int err, err2;
+ if (unlikely(down_trylock(&resize_lock))){
+ ext3_warning(sb,__FUNCTION__,"multiple resizers run on filesystem. Aborting\n");
+ return -EBUSY;
+ }
+
gdb_num = input->group / EXT3_DESC_PER_BLOCK(sb);
gdb_off = input->group % EXT3_DESC_PER_BLOCK(sb);
@@ -753,12 +759,6 @@ int ext3_group_add(struct super_block *s
goto exit_put;
}
- lock_super(sb);
- if (input->group != EXT3_SB(sb)->s_groups_count) {
- ext3_warning(sb, __FUNCTION__,
- "multiple resizers run on filesystem!\n");
- goto exit_journal;
- }
if ((err = ext3_journal_get_write_access(handle, sbi->s_sbh)))
goto exit_journal;
@@ -847,6 +847,7 @@ int ext3_group_add(struct super_block *s
*/
smp_wmb();
+ lock_super(sb);
/* Update the global fs size fields */
EXT3_SB(sb)->s_groups_count++;
@@ -865,9 +866,9 @@ int ext3_group_add(struct super_block *s
ext3_journal_dirty_metadata(handle, EXT3_SB(sb)->s_sbh);
sb->s_dirt = 1;
+ unlock_super(sb);
exit_journal:
- unlock_super(sb);
if ((err2 = ext3_journal_stop(handle)) && !err)
err = err2;
if (!err) {
@@ -877,6 +878,7 @@ exit_journal:
primary->b_size);
}
exit_put:
+ up(&resize_lock);
iput(inode);
return err;
} /* ext3_group_add */
@@ -901,6 +903,12 @@ int ext3_group_extend(struct super_block
handle_t *handle;
int err, freed_blocks;
+
+ if (unlikely(down_trylock(&resize_lock))){
+ ext3_warning(sb,__FUNCTION__,"multiple resizers run on filesystem. Aborting\n");
+ return -EBUSY;
+ }
+
/* We don't need to worry about locking wrt other resizers just
* yet: we're going to revalidate es->s_blocks_count after
* taking lock_super() below. */
@@ -911,13 +919,15 @@ int ext3_group_extend(struct super_block
printk(KERN_DEBUG "EXT3-fs: extending last group from %lu to %lu blocks\n",
o_blocks_count, n_blocks_count);
+ err = 0;
if (n_blocks_count == 0 || n_blocks_count == o_blocks_count)
- return 0;
+ goto exit_put;
if (n_blocks_count < o_blocks_count) {
ext3_warning(sb, __FUNCTION__,
"can't shrink FS - resize aborted");
- return -EBUSY;
+ err = -EBUSY;
+ goto exit_put;
}
/* Handle the remaining blocks in the last group only. */
@@ -927,7 +937,8 @@ int ext3_group_extend(struct super_block
if (last == 0) {
ext3_warning(sb, __FUNCTION__,
"need to use ext2online to resize further\n");
- return -EPERM;
+ err = -EPERM;
+ goto exit_put;
}
add = EXT3_BLOCKS_PER_GROUP(sb) - last;
@@ -945,7 +956,8 @@ int ext3_group_extend(struct super_block
if (!bh) {
ext3_warning(sb, __FUNCTION__,
"can't read last block, resize aborted");
- return -ENOSPC;
+ err = -ENOSPC;
+ goto exit_put;
}
brelse(bh);
@@ -959,26 +971,15 @@ int ext3_group_extend(struct super_block
goto exit_put;
}
- lock_super(sb);
- if (o_blocks_count != le32_to_cpu(es->s_blocks_count)) {
- ext3_warning(sb, __FUNCTION__,
- "multiple resizers run on filesystem!\n");
- err = -EBUSY;
- goto exit_put;
- }
-
if ((err = ext3_journal_get_write_access(handle,
EXT3_SB(sb)->s_sbh))) {
ext3_warning(sb, __FUNCTION__,
"error %d on journal write access", err);
- unlock_super(sb);
ext3_journal_stop(handle);
goto exit_put;
}
es->s_blocks_count = cpu_to_le32(o_blocks_count + add);
ext3_journal_dirty_metadata(handle, EXT3_SB(sb)->s_sbh);
- sb->s_dirt = 1;
- unlock_super(sb);
ext3_debug("freeing blocks %ld through %ld\n", o_blocks_count,
o_blocks_count + add);
ext3_free_blocks_sb(handle, sb, o_blocks_count, add, &freed_blocks);
@@ -992,5 +993,6 @@ int ext3_group_extend(struct super_block
update_backups(sb, EXT3_SB(sb)->s_sbh->b_blocknr, (char *)es,
sizeof(struct ext3_super_block));
exit_put:
+ up(&resize_lock);
return err;
} /* ext3_group_extend */
diff -bup linux-2.6.13-orig/fs/ext3/super.c linux/fs/ext3/super.c
--- linux-2.6.13-orig/fs/ext3/super.c 2005-09-01 12:44:31.000000000 -0300
+++ linux/fs/ext3/super.c 2005-09-01 12:02:09.000000000 -0300
@@ -639,8 +639,8 @@ static match_table_t tokens = {
{Opt_quota, "quota"},
{Opt_quota, "usrquota"},
{Opt_barrier, "barrier=%u"},
+ {Opt_resize, "resize=%u"},
{Opt_err, NULL},
- {Opt_resize, "resize"},
};
static unsigned long get_sb_block(void **data)
@@ -2195,6 +2195,8 @@ static int ext3_remount (struct super_bl
sbi->s_mount_state = le16_to_cpu(es->s_state);
if ((ret = ext3_group_extend(sb, es, n_blocks_count))) {
err = ret;
+ if (!err)
+ sb->s_dirt = 1;
goto restore_opts;
}
if (!ext3_setup_super (sb, es, 0))
Only in linux/fs/ext3/: xref
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2005-09-01 23:02 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-08-24 21:03 [PATCH] Ext3 online resizing locking issue Glauber de Oliveira Costa
2005-08-25 19:02 ` Stephen C. Tweedie
2005-08-25 20:43 ` Glauber de Oliveira Costa
2005-08-30 14:06 ` Stephen C. Tweedie
2005-08-31 11:35 ` Glauber de Oliveira Costa
2005-08-31 13:30 ` Stephen C. Tweedie
2005-09-01 23:04 ` [PATCH][RFC] Ext3 online resizing locking issue (Again) Glauber de Oliveira Costa
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).