[PATCH,STABLE 2.6.29 01/18] ext4: don't inherit inappropriate inode flags from parent

* [PATCH,STABLE 2.6.29 01/18] ext4: don't inherit inappropriate inode flags from parent
@ 2009-06-02 12:07 Theodore Ts'o
  2009-06-02 12:07 ` [PATCH,STABLE 2.6.29 02/18] ext4: tighten restrictions on inode flags Theodore Ts'o
  2009-06-09  9:33 ` patch ext4-don-t-inherit-inappropriate-inode-flags-from-parent.patch " gregkh
  0 siblings, 2 replies; 26+ messages in thread
From: Theodore Ts'o @ 2009-06-02 12:07 UTC (permalink / raw)
  To: stable; +Cc: linux-ext4, Duane Griffin, Andrew Morton, Theodore Ts'o

From: Duane Griffin <duaneg@dghda.com>

At present INDEX and EXTENTS are the only flags that new ext4 inodes do
NOT inherit from their parent.  In addition prevent the flags DIRTY,
ECOMPR, IMAGIC, TOPDIR, HUGE_FILE and EXT_MIGRATE from being inherited.
List inheritable flags explicitly to prevent future flags from
accidentally being inherited.
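
The difference between the old and new masking styles can be sketched
in plain C.  This is only an illustration: the DEMO_* names and bit
values below are hypothetical and are not the real EXT4_*_FL constants.

```c
#include <stdint.h>

/* Hypothetical flag bits, chosen only for this sketch. */
#define DEMO_SYNC_FL     0x0001u  /* sensible to inherit */
#define DEMO_NOATIME_FL  0x0002u  /* sensible to inherit */
#define DEMO_INDEX_FL    0x0004u  /* per-inode state, excluded by the old code */
#define DEMO_TOPDIR_FL   0x0008u  /* per-inode state, leaked by the old code */

/* Old style: inherit everything except an explicit exclusion list. */
uint32_t demo_inherit_excluding(uint32_t parent_flags)
{
	return parent_flags & ~DEMO_INDEX_FL;
}

/* New style: inherit only the flags that are listed explicitly. */
#define DEMO_FL_INHERITED (DEMO_SYNC_FL | DEMO_NOATIME_FL)

uint32_t demo_inherit_listed(uint32_t parent_flags)
{
	return parent_flags & DEMO_FL_INHERITED;
}
```

With the exclusion list, any flag added later (or forgotten, like
TOPDIR here) is silently inherited; with the explicit list it is
dropped unless someone adds it on purpose.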

This fixes the TOPDIR flag inheritance bug reported at
http://bugzilla.kernel.org/show_bug.cgi?id=9866.

Signed-off-by: Duane Griffin <duaneg@dghda.com>
Acked-by: Andreas Dilger <adilger@sun.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
(cherry picked from commit 8fa43a81b97853fc69417bb6054182e78f95cbeb)
---
 fs/ext4/ext4.h   |    7 +++++++
 fs/ext4/ialloc.c |    2 +-
 2 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 90909f9..45af699 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -248,6 +248,13 @@ struct flex_groups {
 #define EXT4_FL_USER_VISIBLE		0x000BDFFF /* User visible flags */
 #define EXT4_FL_USER_MODIFIABLE		0x000B80FF /* User modifiable flags */
 
+/* Flags that should be inherited by new inodes from their parent. */
+#define EXT4_FL_INHERITED (EXT4_SECRM_FL | EXT4_UNRM_FL | EXT4_COMPR_FL |\
+			   EXT4_SYNC_FL | EXT4_IMMUTABLE_FL | EXT4_APPEND_FL |\
+			   EXT4_NODUMP_FL | EXT4_NOATIME_FL |\
+			   EXT4_NOCOMPR_FL | EXT4_JOURNAL_DATA_FL |\
+			   EXT4_NOTAIL_FL | EXT4_DIRSYNC_FL)
+
 /*
  * Inode dynamic state flags
  */
diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index 2d2b358..6f09543 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -889,7 +889,7 @@ got:
 	 * newly created directory and file only if -o extent mount option is
 	 * specified
 	 */
-	ei->i_flags = EXT4_I(dir)->i_flags & ~(EXT4_INDEX_FL|EXT4_EXTENTS_FL);
+	ei->i_flags = EXT4_I(dir)->i_flags & EXT4_FL_INHERITED;
 	if (S_ISLNK(mode))
 		ei->i_flags &= ~(EXT4_IMMUTABLE_FL|EXT4_APPEND_FL);
 	/* dirsync only applies to directories */
-- 
1.6.3.1.1.g75fc.dirty


* [PATCH,STABLE 2.6.29 02/18] ext4: tighten restrictions on inode flags
  2009-06-02 12:07 [PATCH,STABLE 2.6.29 01/18] ext4: don't inherit inappropriate inode flags from parent Theodore Ts'o
@ 2009-06-02 12:07 ` Theodore Ts'o
  2009-06-02 12:07   ` [PATCH,STABLE 2.6.29 03/18] ext4: return -EIO not -ESTALE on directory traversal through deleted inode Theodore Ts'o
  2009-06-09  9:33   ` patch ext4-tighten-restrictions-on-inode-flags.patch " gregkh
  2009-06-09  9:33 ` patch ext4-don-t-inherit-inappropriate-inode-flags-from-parent.patch " gregkh
  1 sibling, 2 replies; 26+ messages in thread
From: Theodore Ts'o @ 2009-06-02 12:07 UTC (permalink / raw)
  To: stable; +Cc: linux-ext4, Duane Griffin, Andrew Morton, Theodore Ts'o

From: Duane Griffin <duaneg@dghda.com>

At the moment there are few restrictions on which flags may be set on
which inodes.  Specifically, DIRSYNC may only be set on directories, and
IMMUTABLE and APPEND may not be set on links.  Tighten that to disallow
TOPDIR being set on non-directories, and to allow only NODUMP and NOATIME
to be set on inodes that are neither regular files nor directories.

Introduce a flag-masking function which masks flags based on mode, and
use it during inode creation and when flags are set via the ioctl, to
facilitate future consistency.

Signed-off-by: Duane Griffin <duaneg@dghda.com>
Acked-by: Andreas Dilger <adilger@sun.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
(cherry picked from commit 2dc6b0d48ca0599837df21b14bb8393d0804af57)
---
 fs/ext4/ext4.h   |   17 +++++++++++++++++
 fs/ext4/ialloc.c |   14 +++++---------
 fs/ext4/ioctl.c  |    3 +--
 3 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 45af699..6a954de 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -255,6 +255,23 @@ struct flex_groups {
 			   EXT4_NOCOMPR_FL | EXT4_JOURNAL_DATA_FL |\
 			   EXT4_NOTAIL_FL | EXT4_DIRSYNC_FL)
 
+/* Flags that are appropriate for regular files (all but dir-specific ones). */
+#define EXT4_REG_FLMASK (~(EXT4_DIRSYNC_FL | EXT4_TOPDIR_FL))
+
+/* Flags that are appropriate for non-directories/regular files. */
+#define EXT4_OTHER_FLMASK (EXT4_NODUMP_FL | EXT4_NOATIME_FL)
+
+/* Mask out flags that are inappropriate for the given type of inode. */
+static inline __u32 ext4_mask_flags(umode_t mode, __u32 flags)
+{
+	if (S_ISDIR(mode))
+		return flags;
+	else if (S_ISREG(mode))
+		return flags & EXT4_REG_FLMASK;
+	else
+		return flags & EXT4_OTHER_FLMASK;
+}
+
 /*
  * Inode dynamic state flags
  */
diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index 6f09543..befd95e 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -885,16 +885,12 @@ got:
 	ei->i_disksize = 0;
 
 	/*
-	 * Don't inherit extent flag from directory. We set extent flag on
-	 * newly created directory and file only if -o extent mount option is
-	 * specified
+	 * Don't inherit extent flag from directory, amongst others. We set
+	 * extent flag on newly created directory and file only if -o extent
+	 * mount option is specified
 	 */
-	ei->i_flags = EXT4_I(dir)->i_flags & EXT4_FL_INHERITED;
-	if (S_ISLNK(mode))
-		ei->i_flags &= ~(EXT4_IMMUTABLE_FL|EXT4_APPEND_FL);
-	/* dirsync only applies to directories */
-	if (!S_ISDIR(mode))
-		ei->i_flags &= ~EXT4_DIRSYNC_FL;
+	ei->i_flags =
+		ext4_mask_flags(mode, EXT4_I(dir)->i_flags & EXT4_FL_INHERITED);
 	ei->i_file_acl = 0;
 	ei->i_dtime = 0;
 	ei->i_block_group = group;
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index 42dc83f..22dd29f 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -48,8 +48,7 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
 		if (err)
 			return err;
 
-		if (!S_ISDIR(inode->i_mode))
-			flags &= ~EXT4_DIRSYNC_FL;
+		flags = ext4_mask_flags(inode->i_mode, flags);
 
 		err = -EPERM;
 		mutex_lock(&inode->i_mutex);
-- 
1.6.3.1.1.g75fc.dirty


* [PATCH,STABLE 2.6.29 03/18] ext4: return -EIO not -ESTALE on directory traversal through deleted inode
  2009-06-02 12:07 ` [PATCH,STABLE 2.6.29 02/18] ext4: tighten restrictions on inode flags Theodore Ts'o
@ 2009-06-02 12:07   ` Theodore Ts'o
  2009-06-02 12:07     ` [PATCH,STABLE 2.6.29 04/18] ext4: Add fine print for the 32000 subdirectory limit Theodore Ts'o
  2009-06-09  9:33     ` patch ext4-return-eio-not-estale-on-directory-traversal-through-deleted-inode.patch added to 2.6.29-stable tree gregkh
  2009-06-09  9:33   ` patch ext4-tighten-restrictions-on-inode-flags.patch " gregkh
  1 sibling, 2 replies; 26+ messages in thread
From: Theodore Ts'o @ 2009-06-02 12:07 UTC (permalink / raw)
  To: stable; +Cc: linux-ext4, Bryan Donlan, Andrew Morton, Theodore Ts'o

From: Bryan Donlan <bdonlan@gmail.com>

ext4_iget() returns -ESTALE if invoked on a deleted inode, in order to
report errors to NFS properly.  However, in ext4_lookup(), this
-ESTALE can be propagated to userspace if the filesystem is corrupted
such that a directory entry references a deleted inode.  This leads to
a misleading error message - "Stale NFS file handle" - and confusion
on the part of the admin.

The bug can be easily reproduced by creating a new filesystem, making
a link to an unused inode using debugfs, then mounting and attempting
to ls -l said link.

This patch thus changes ext4_lookup to return -EIO if it receives
-ESTALE from ext4_iget(), as ext4 does for other filesystem metadata
corruption; and also invokes the appropriate ext*_error functions when
this case is detected.
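
The error translation itself is small.  A minimal userspace sketch,
where demo_lookup_errno() is a hypothetical helper and not a kernel
function, could be:

```c
#include <errno.h>

/*
 * Translate the error returned by an inode read into the error a
 * directory lookup should report.  A directory entry that points at a
 * deleted inode is on-disk corruption, so it is reported as -EIO
 * rather than leaking the NFS-oriented -ESTALE to userspace.
 */
int demo_lookup_errno(int iget_err)
{
	if (iget_err == -ESTALE)
		return -EIO;
	return iget_err;	/* every other error is passed through unchanged */
}
```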

Signed-off-by: Bryan Donlan <bdonlan@gmail.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
(cherry picked from commit e6f009b0b45220c004672d41a58865e94946104d)
---
 fs/ext4/namei.c |   12 ++++++++++--
 1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index ba702bd..f787234 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1052,8 +1052,16 @@ static struct dentry *ext4_lookup(struct inode *dir, struct dentry *dentry, stru
 			return ERR_PTR(-EIO);
 		}
 		inode = ext4_iget(dir->i_sb, ino);
-		if (IS_ERR(inode))
-			return ERR_CAST(inode);
+		if (unlikely(IS_ERR(inode))) {
+			if (PTR_ERR(inode) == -ESTALE) {
+				ext4_error(dir->i_sb, __func__,
+						"deleted inode referenced: %u",
+						ino);
+				return ERR_PTR(-EIO);
+			} else {
+				return ERR_CAST(inode);
+			}
+		}
 	}
 	return d_splice_alias(inode, dentry);
 }
-- 
1.6.3.1.1.g75fc.dirty


* [PATCH,STABLE 2.6.29 04/18] ext4: Add fine print for the 32000 subdirectory limit
  2009-06-02 12:07   ` [PATCH,STABLE 2.6.29 03/18] ext4: return -EIO not -ESTALE on directory traversal through deleted inode Theodore Ts'o
@ 2009-06-02 12:07     ` Theodore Ts'o
  2009-06-02 12:07       ` [PATCH,STABLE 2.6.29 05/18] ext4: add EXT4_IOC_ALLOC_DA_BLKS ioctl Theodore Ts'o
  2009-06-09  9:33     ` patch ext4-return-eio-not-estale-on-directory-traversal-through-deleted-inode.patch added to 2.6.29-stable tree gregkh
  1 sibling, 1 reply; 26+ messages in thread
From: Theodore Ts'o @ 2009-06-02 12:07 UTC (permalink / raw)
  To: stable; +Cc: linux-ext4, Theodore Ts'o

Some people are reading the ext4 feature list too literally, creating
dubious test cases involving very long filenames and a 1k blocksize, and
then complaining when they run into an htree-imposed limit.  So add fine
print to the "fix 32000 subdirectory limit" ext4 feature.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
(cherry picked from commit 722bde6875bfb49a0c84e5601eb82dd7ac02d27c)
---
 Documentation/filesystems/ext4.txt |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/Documentation/filesystems/ext4.txt b/Documentation/filesystems/ext4.txt
index cec829b..5c484ae 100644
--- a/Documentation/filesystems/ext4.txt
+++ b/Documentation/filesystems/ext4.txt
@@ -85,7 +85,7 @@ Note: More extensive information for getting started with ext4 can be
 * extent format more robust in face of on-disk corruption due to magics,
 * internal redundancy in tree
 * improved file allocation (multi-block alloc)
-* fix 32000 subdirectory limit
+* lift 32000 subdirectory limit imposed by i_links_count[1]
 * nsec timestamps for mtime, atime, ctime, create time
 * inode version field on disk (NFSv4, Lustre)
 * reduced e2fsck time via uninit_bg feature
@@ -100,6 +100,9 @@ Note: More extensive information for getting started with ext4 can be
 * efficent new ordered mode in JBD2 and ext4(avoid using buffer head to force
   the ordering)
 
+[1] Filesystems with a block size of 1k may see a limit imposed by the
+directory hash tree having a maximum depth of two.
+
 2.2 Candidate features for future inclusion
 
 * Online defrag (patches available but not well tested)
-- 
1.6.3.1.1.g75fc.dirty


* [PATCH,STABLE 2.6.29 05/18] ext4: add EXT4_IOC_ALLOC_DA_BLKS ioctl
  2009-06-02 12:07     ` [PATCH,STABLE 2.6.29 04/18] ext4: Add fine print for the 32000 subdirectory limit Theodore Ts'o
@ 2009-06-02 12:07       ` Theodore Ts'o
  2009-06-02 12:07         ` [PATCH,STABLE 2.6.29 06/18] ext4: Automatically allocate delay allocated blocks on close Theodore Ts'o
  0 siblings, 1 reply; 26+ messages in thread
From: Theodore Ts'o @ 2009-06-02 12:07 UTC (permalink / raw)
  To: stable; +Cc: linux-ext4, Theodore Ts'o

Add an ioctl which forces all of the delay allocated blocks to be
allocated.  This also provides a function ext4_alloc_da_blocks() which
will be used by the following commits to force files to be fully
allocated to preserve application-expected ext3 behaviour.
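
From userspace the ioctl would be invoked roughly as follows.  This is
a Linux-only sketch: the macro is redefined locally on the assumption
that userspace headers do not export it, and demo_alloc_da_blocks() is
a hypothetical wrapper name.

```c
#include <sys/ioctl.h>

/* Same definition as the one this patch adds to fs/ext4/ext4.h. */
#ifndef EXT4_IOC_ALLOC_DA_BLKS
#define EXT4_IOC_ALLOC_DA_BLKS _IO('f', 12)
#endif

/*
 * Ask the filesystem to allocate all delayed-allocation blocks of the
 * file behind fd.  Returns 0 on success, or -1 with errno set (for
 * example ENOTTY when the filesystem does not implement the ioctl).
 */
int demo_alloc_da_blocks(int fd)
{
	return ioctl(fd, EXT4_IOC_ALLOC_DA_BLKS);
}
```

Per the patch, the caller has to own the file or hold the matching
capability, otherwise the kernel side returns -EACCES.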

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
(cherry picked from commit ccd2506bd43113659aa904d5bea5d1300605e2a6)
---
 fs/ext4/ext4.h  |    3 +++
 fs/ext4/inode.c |   42 ++++++++++++++++++++++++++++++++++++++++++
 fs/ext4/ioctl.c |   14 ++++++++++++++
 3 files changed, 59 insertions(+), 0 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 6a954de..f5552d7 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -326,7 +326,9 @@ struct ext4_new_group_data {
 #define EXT4_IOC_GROUP_EXTEND		_IOW('f', 7, unsigned long)
 #define EXT4_IOC_GROUP_ADD		_IOW('f', 8, struct ext4_new_group_input)
 #define EXT4_IOC_MIGRATE		_IO('f', 9)
+ /* note ioctl 10 reserved for an early version of the FIEMAP ioctl */
  /* note ioctl 11 reserved for filesystem-independent FIEMAP ioctl */
+#define EXT4_IOC_ALLOC_DA_BLKS		_IO('f', 12)
 
 /*
  * ioctl commands in 32 bit emulation
@@ -1115,6 +1117,7 @@ extern int ext4_can_truncate(struct inode *inode);
 extern void ext4_truncate(struct inode *);
 extern void ext4_set_inode_flags(struct inode *);
 extern void ext4_get_inode_flags(struct ext4_inode_info *);
+extern int ext4_alloc_da_blocks(struct inode *inode);
 extern void ext4_set_aops(struct inode *inode);
 extern int ext4_writepage_trans_blocks(struct inode *);
 extern int ext4_meta_trans_blocks(struct inode *, int nrblocks, int idxblocks);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 2c0439d..0c93ce0 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2816,6 +2816,48 @@ out:
 	return;
 }
 
+/*
+ * Force all delayed allocation blocks to be allocated for a given inode.
+ */
+int ext4_alloc_da_blocks(struct inode *inode)
+{
+	if (!EXT4_I(inode)->i_reserved_data_blocks &&
+	    !EXT4_I(inode)->i_reserved_meta_blocks)
+		return 0;
+
+	/*
+	 * We do something simple for now.  The filemap_flush() will
+	 * also start triggering a write of the data blocks, which is
+	 * not strictly speaking necessary (and for users of
+	 * laptop_mode, not even desirable).  However, to do otherwise
+	 * would require replicating code paths in:
+	 * 
+	 * ext4_da_writepages() ->
+	 *    write_cache_pages() ---> (via passed in callback function)
+	 *        __mpage_da_writepage() -->
+	 *           mpage_add_bh_to_extent()
+	 *           mpage_da_map_blocks()
+	 *
+	 * The problem is that write_cache_pages(), located in
+	 * mm/page-writeback.c, marks pages clean in preparation for
+	 * doing I/O, which is not desirable if we're not planning on
+	 * doing I/O at all.
+	 *
+	 * We could call write_cache_pages(), and then redirty all of
+	 * the pages by calling redirty_page_for_writeback() but that
+	 * would be ugly in the extreme.  So instead we would need to
+	 * replicate parts of the code in the above functions,
+	 * simplifying them becuase we wouldn't actually intend to
+	 * write out the pages, but rather only collect contiguous
+	 * logical block extents, call the multi-block allocator, and
+	 * then update the buffer heads with the block allocations.
+	 * 
+	 * For now, though, we'll cheat by calling filemap_flush(),
+	 * which will map the blocks, and start the I/O, but not
+	 * actually wait for the I/O to complete.
+	 */
+	return filemap_flush(inode->i_mapping);
+}
 
 /*
  * bmap() is special.  It gets used by applications such as lilo and by
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index 22dd29f..91e75f7 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -262,6 +262,20 @@ setversion_out:
 		return err;
 	}
 
+	case EXT4_IOC_ALLOC_DA_BLKS:
+	{
+		int err;
+		if (!is_owner_or_cap(inode))
+			return -EACCES;
+
+		err = mnt_want_write(filp->f_path.mnt);
+		if (err)
+			return err;
+		err = ext4_alloc_da_blocks(inode);
+		mnt_drop_write(filp->f_path.mnt);
+		return err;
+	}
+
 	default:
 		return -ENOTTY;
 	}
-- 
1.6.3.1.1.g75fc.dirty


* [PATCH,STABLE 2.6.29 06/18] ext4: Automatically allocate delay allocated blocks on close
  2009-06-02 12:07       ` [PATCH,STABLE 2.6.29 05/18] ext4: add EXT4_IOC_ALLOC_DA_BLKS ioctl Theodore Ts'o
@ 2009-06-02 12:07         ` Theodore Ts'o
  2009-06-02 12:07           ` [PATCH,STABLE 2.6.29 07/18] ext4: Automatically allocate delay allocated blocks on rename Theodore Ts'o
  2009-06-03 18:14           ` [PATCH,STABLE 2.6.29 06/18] ext4: Automatically allocate delay allocated blocks on close Andreas Dilger
  0 siblings, 2 replies; 26+ messages in thread
From: Theodore Ts'o @ 2009-06-02 12:07 UTC (permalink / raw)
  To: stable; +Cc: linux-ext4, Theodore Ts'o

When closing a file that had been previously truncated, force any
delay allocated blocks to be allocated so that if the filesystem
is mounted with data=ordered, the data blocks will be pushed out to
disk along with the journal commit.  Many application programs expect
this, so we do this to avoid zero length files if the system crashes
unexpectedly.
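
The mark-on-truncate, act-on-close logic can be modelled outside the
kernel.  In this sketch struct demo_inode and its flush_count field are
hypothetical stand-ins; flush_count simply counts how often
ext4_alloc_da_blocks() would have been called.

```c
#include <stdint.h>

#define DEMO_STATE_DA_ALLOC_CLOSE 0x10u	/* same bit value as in the patch */

struct demo_inode {
	uint64_t size;
	uint32_t state;
	unsigned int flush_count;	/* stands in for ext4_alloc_da_blocks() */
};

/* Truncating to zero length arms the "allocate on close" state bit. */
void demo_truncate(struct demo_inode *inode, uint64_t new_size)
{
	inode->size = new_size;
	if (inode->size == 0)
		inode->state |= DEMO_STATE_DA_ALLOC_CLOSE;
}

/* On close, force allocation once and then disarm the bit. */
void demo_release(struct demo_inode *inode)
{
	if (inode->state & DEMO_STATE_DA_ALLOC_CLOSE) {
		inode->flush_count++;
		inode->state &= ~DEMO_STATE_DA_ALLOC_CLOSE;
	}
}
```

Files that were never truncated to zero keep the full benefit of
delayed allocation, since the bit is never set for them.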

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
(cherry picked from commit 7d8f9f7d150dded7b68e61ca6403a1f166fb4edf)
---
 fs/ext4/ext4.h  |    1 +
 fs/ext4/file.c  |    4 ++++
 fs/ext4/inode.c |    3 +++
 3 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index f5552d7..83f685d 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -279,6 +279,7 @@ static inline __u32 ext4_mask_flags(umode_t mode, __u32 flags)
 #define EXT4_STATE_NEW			0x00000002 /* inode is newly created */
 #define EXT4_STATE_XATTR		0x00000004 /* has in-inode xattrs */
 #define EXT4_STATE_NO_EXPAND		0x00000008 /* No space for expansion */
+#define EXT4_STATE_DA_ALLOC_CLOSE	0x00000010 /* Alloc DA blks on close */
 
 /* Used to pass group descriptor data when online resize is done */
 struct ext4_new_group_input {
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index f731cb5..06df827 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -33,6 +33,10 @@
  */
 static int ext4_release_file(struct inode *inode, struct file *filp)
 {
+	if (EXT4_I(inode)->i_state & EXT4_STATE_DA_ALLOC_CLOSE) {
+		ext4_alloc_da_blocks(inode);
+		EXT4_I(inode)->i_state &= ~EXT4_STATE_DA_ALLOC_CLOSE;
+	}
 	/* if we are the last writer on the inode, drop the block reservation */
 	if ((filp->f_mode & FMODE_WRITE) &&
 			(atomic_read(&inode->i_writecount) == 1))
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 0c93ce0..8c7259a 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3880,6 +3880,9 @@ void ext4_truncate(struct inode *inode)
 	if (!ext4_can_truncate(inode))
 		return;
 
+	if (inode->i_size == 0)
+		ei->i_state |= EXT4_STATE_DA_ALLOC_CLOSE;
+
 	if (EXT4_I(inode)->i_flags & EXT4_EXTENTS_FL) {
 		ext4_ext_truncate(inode);
 		return;
-- 
1.6.3.1.1.g75fc.dirty


* [PATCH,STABLE 2.6.29 07/18] ext4: Automatically allocate delay allocated blocks on rename
  2009-06-02 12:07         ` [PATCH,STABLE 2.6.29 06/18] ext4: Automatically allocate delay allocated blocks on close Theodore Ts'o
@ 2009-06-02 12:07           ` Theodore Ts'o
  2009-06-02 12:07             ` [PATCH,STABLE 2.6.29 08/18] ext4: Fix discard of inode prealloc space with delayed allocation Theodore Ts'o
  2009-06-03 18:14           ` [PATCH,STABLE 2.6.29 06/18] ext4: Automatically allocate delay allocated blocks on close Andreas Dilger
  1 sibling, 1 reply; 26+ messages in thread
From: Theodore Ts'o @ 2009-06-02 12:07 UTC (permalink / raw)
  To: stable; +Cc: linux-ext4, Theodore Ts'o

When renaming a file such that a link to another inode is overwritten,
force any delay allocated blocks to be allocated so that if the
filesystem is mounted with data=ordered, the data blocks will be
pushed out to disk along with the journal commit.  Many application
programs expect this, so we do this to avoid zero length files if the
system crashes unexpectedly.
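
The application pattern in question is the common "write a temporary
file, then rename it over the old one" update.  A minimal sketch of
such an application, with demo_replace_file() as a hypothetical name,
is shown here; note that it deliberately omits the fsync() call that a
careful application would issue before the rename.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Replace the contents of path by writing tmp_path and renaming it. */
int demo_replace_file(const char *path, const char *tmp_path,
		      const char *data)
{
	size_t len = strlen(data);
	int fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

	if (fd < 0)
		return -1;
	if (write(fd, data, len) != (ssize_t)len) {
		close(fd);
		unlink(tmp_path);
		return -1;
	}
	/* A careful application would call fsync(fd) at this point. */
	if (close(fd) != 0 || rename(tmp_path, path) != 0) {
		unlink(tmp_path);
		return -1;
	}
	return 0;
}
```

Without the fsync(), the new data may still be unallocated when the
rename is committed, which is the window this patch narrows.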

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
(cherry picked from commit 8750c6d5fcbd3342b3d908d157f81d345c5325a7)
---
 fs/ext4/namei.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index f787234..63568ec 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -2319,7 +2319,7 @@ static int ext4_rename(struct inode *old_dir, struct dentry *old_dentry,
 	struct inode *old_inode, *new_inode;
 	struct buffer_head *old_bh, *new_bh, *dir_bh;
 	struct ext4_dir_entry_2 *old_de, *new_de;
-	int retval;
+	int retval, force_da_alloc = 0;
 
 	old_bh = new_bh = dir_bh = NULL;
 
@@ -2457,6 +2457,7 @@ static int ext4_rename(struct inode *old_dir, struct dentry *old_dentry,
 		ext4_mark_inode_dirty(handle, new_inode);
 		if (!new_inode->i_nlink)
 			ext4_orphan_add(handle, new_inode);
+		force_da_alloc = 1;
 	}
 	retval = 0;
 
@@ -2465,6 +2466,8 @@ end_rename:
 	brelse(old_bh);
 	brelse(new_bh);
 	ext4_journal_stop(handle);
+	if (retval == 0 && force_da_alloc)
+		ext4_alloc_da_blocks(old_inode);
 	return retval;
 }
 
-- 
1.6.3.1.1.g75fc.dirty


* [PATCH,STABLE 2.6.29 08/18] ext4: Fix discard of inode prealloc space with delayed allocation.
  2009-06-02 12:07           ` [PATCH,STABLE 2.6.29 07/18] ext4: Automatically allocate delay allocated blocks on rename Theodore Ts'o
@ 2009-06-02 12:07             ` Theodore Ts'o
  2009-06-02 12:07               ` [PATCH,STABLE 2.6.29 09/18] ext4: Add auto_da_alloc mount option Theodore Ts'o
  0 siblings, 1 reply; 26+ messages in thread
From: Theodore Ts'o @ 2009-06-02 12:07 UTC (permalink / raw)
  To: stable; +Cc: linux-ext4, Aneesh Kumar K.V, Theodore Ts'o

From: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

With delayed allocation we should not/cannot discard inode prealloc
space during file close, because we would still have dirty pages for
which we haven't allocated blocks yet.  With this fix, after each
get_blocks request we check whether we have zero reserved blocks; if
so, and there are no writers on the file, we discard the inode prealloc
space.
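
The two conditions described above can be written down as predicates.
This is a sketch based on the description only; struct demo_file_state
and both function names are hypothetical.

```c
struct demo_file_state {
	int writecount;				/* open writers on the inode */
	unsigned int reserved_data_blocks;	/* delalloc blocks not yet allocated */
};

/* On close: only the last writer may discard, and only if nothing is pending. */
int demo_discard_on_close(const struct demo_file_state *st, int opened_for_write)
{
	return opened_for_write && st->writecount == 1 &&
	       st->reserved_data_blocks == 0;
}

/* After block allocation: discard once reservations hit zero and no writers remain. */
int demo_discard_after_alloc(const struct demo_file_state *st)
{
	return st->reserved_data_blocks == 0 && st->writecount == 0;
}
```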

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
(cherry picked from commit d6014301b5599fba395c42a1e96a7fe86f7d0b2d)
---
 fs/ext4/file.c  |    3 ++-
 fs/ext4/inode.c |    9 ++++++++-
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 06df827..588af8c 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -39,7 +39,8 @@ static int ext4_release_file(struct inode *inode, struct file *filp)
 	}
 	/* if we are the last writer on the inode, drop the block reservation */
 	if ((filp->f_mode & FMODE_WRITE) &&
-			(atomic_read(&inode->i_writecount) == 1))
+			(atomic_read(&inode->i_writecount) == 1) &&
+		        !EXT4_I(inode)->i_reserved_data_blocks)
 	{
 		down_write(&EXT4_I(inode)->i_data_sem);
 		ext4_discard_preallocations(inode);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 8c7259a..3dafa6b 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1036,8 +1036,15 @@ static void ext4_da_update_reserve_space(struct inode *inode, int used)
 	/* update per-inode reservations */
 	BUG_ON(used  > EXT4_I(inode)->i_reserved_data_blocks);
 	EXT4_I(inode)->i_reserved_data_blocks -= used;

* [PATCH,STABLE 2.6.29 09/18] ext4: Add auto_da_alloc mount option
  2009-06-02 12:07             ` [PATCH,STABLE 2.6.29 08/18] ext4: Fix discard of inode prealloc space with delayed allocation Theodore Ts'o
@ 2009-06-02 12:07               ` Theodore Ts'o
  2009-06-02 12:07                 ` [PATCH,STABLE 2.6.29 10/18] ext4: Check for an valid i_mode when reading the inode from disk Theodore Ts'o
  0 siblings, 1 reply; 26+ messages in thread
From: Theodore Ts'o @ 2009-06-02 12:07 UTC (permalink / raw)
  To: stable; +Cc: linux-ext4, Theodore Ts'o

Add a mount option which allows the user to disable the automatic
allocation of delay allocated blocks when a previously truncated file
is closed or when a file is renamed over an existing file.  This
feature is intended to save users from the effects of naive
application writers, but it reduces the effectiveness of the delayed
allocation code.  The mount option disables this safety feature, which
may be desirable for production systems where the risk of unclean
shutdowns or unexpected system crashes is low.
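
The option takes a numeric argument, and the stored bit has inverted
sense (the bit means "no auto allocation").  A userspace sketch of that
parsing, using sscanf() in place of the kernel's match_int() and a
hypothetical demo_parse_auto_da_alloc() helper, might be:

```c
#include <stdint.h>
#include <stdio.h>

#define DEMO_MOUNT_NO_AUTO_DA_ALLOC 0x10000u	/* bit value taken from the patch */

/* Returns 1 if opt was an auto_da_alloc=<n> option and was applied, else 0. */
int demo_parse_auto_da_alloc(const char *opt, uint32_t *mount_opt)
{
	unsigned int value;
	char trailing;

	/* Exactly one conversion means a number with nothing after it. */
	if (sscanf(opt, "auto_da_alloc=%u%c", &value, &trailing) != 1)
		return 0;
	if (value)
		*mount_opt &= ~DEMO_MOUNT_NO_AUTO_DA_ALLOC;	/* feature on */
	else
		*mount_opt |= DEMO_MOUNT_NO_AUTO_DA_ALLOC;	/* feature off */
	return 1;
}
```

So auto_da_alloc=0 sets the bit and turns the safety feature off, while
any non-zero value clears it again.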

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
(cherry picked from commit afd4672dc7610b7feef5190168aa917cc2e417e4)
---
 fs/ext4/ext4.h  |    2 +-
 fs/ext4/inode.c |    2 +-
 fs/ext4/namei.c |    3 ++-
 fs/ext4/super.c |   25 +++++++++++++------------
 4 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 83f685d..a2bd86e 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -557,7 +557,7 @@ do {									       \
 #define EXT4_MOUNT_NO_UID32		0x02000  /* Disable 32-bit UIDs */
 #define EXT4_MOUNT_XATTR_USER		0x04000	/* Extended user attributes */
 #define EXT4_MOUNT_POSIX_ACL		0x08000	/* POSIX Access Control Lists */
-#define EXT4_MOUNT_RESERVATION		0x10000	/* Preallocation */
+#define EXT4_MOUNT_NO_AUTO_DA_ALLOC	0x10000	/* No auto delalloc mapping */
 #define EXT4_MOUNT_BARRIER		0x20000 /* Use block barriers */
 #define EXT4_MOUNT_NOBH			0x40000 /* No bufferheads */
 #define EXT4_MOUNT_QUOTA		0x80000 /* Some quota option set */
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 3dafa6b..8ff6762 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3887,7 +3887,7 @@ void ext4_truncate(struct inode *inode)
 	if (!ext4_can_truncate(inode))
 		return;
 
-	if (inode->i_size == 0)
+	if (inode->i_size == 0 && !test_opt(inode->i_sb, NO_AUTO_DA_ALLOC))
 		ei->i_state |= EXT4_STATE_DA_ALLOC_CLOSE;
 
 	if (EXT4_I(inode)->i_flags & EXT4_EXTENTS_FL) {
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 63568ec..8977e60 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -2457,7 +2457,8 @@ static int ext4_rename(struct inode *old_dir, struct dentry *old_dentry,
 		ext4_mark_inode_dirty(handle, new_inode);
 		if (!new_inode->i_nlink)
 			ext4_orphan_add(handle, new_inode);
-		force_da_alloc = 1;
+		if (!test_opt(new_dir->i_sb, NO_AUTO_DA_ALLOC))
+			force_da_alloc = 1;
 	}
 	retval = 0;
 
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 39d1993..1ad3c20 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -803,8 +803,6 @@ static int ext4_show_options(struct seq_file *seq, struct vfsmount *vfs)
 	if (!test_opt(sb, POSIX_ACL) && (def_mount_opts & EXT4_DEFM_ACL))
 		seq_puts(seq, ",noacl");
 #endif
-	if (!test_opt(sb, RESERVATION))
-		seq_puts(seq, ",noreservation");
 	if (sbi->s_commit_interval != JBD2_DEFAULT_MAX_COMMIT_AGE*HZ) {
 		seq_printf(seq, ",commit=%u",
 			   (unsigned) (sbi->s_commit_interval / HZ));
@@ -855,6 +853,9 @@ static int ext4_show_options(struct seq_file *seq, struct vfsmount *vfs)
 	if (test_opt(sb, DATA_ERR_ABORT))
 		seq_puts(seq, ",data_err=abort");
 
+	if (test_opt(sb, NO_AUTO_DA_ALLOC))
+		seq_puts(seq, ",auto_da_alloc=0");
+
 	ext4_show_quota_options(seq, sb);
 	return 0;
 }
@@ -1002,7 +1003,7 @@ enum {
 	Opt_resgid, Opt_resuid, Opt_sb, Opt_err_cont, Opt_err_panic, Opt_err_ro,
 	Opt_nouid32, Opt_debug, Opt_oldalloc, Opt_orlov,
 	Opt_user_xattr, Opt_nouser_xattr, Opt_acl, Opt_noacl,
-	Opt_reservation, Opt_noreservation, Opt_noload, Opt_nobh, Opt_bh,
+	Opt_auto_da_alloc, Opt_noload, Opt_nobh, Opt_bh,
 	Opt_commit, Opt_min_batch_time, Opt_max_batch_time,
 	Opt_journal_update, Opt_journal_dev,
 	Opt_journal_checksum, Opt_journal_async_commit,
@@ -1037,8 +1038,6 @@ static const match_table_t tokens = {
 	{Opt_nouser_xattr, "nouser_xattr"},
 	{Opt_acl, "acl"},
 	{Opt_noacl, "noacl"},
-	{Opt_reservation, "reservation"},
-	{Opt_noreservation, "noreservation"},
 	{Opt_noload, "noload"},
 	{Opt_nobh, "nobh"},
 	{Opt_bh, "bh"},
@@ -1073,6 +1072,7 @@ static const match_table_t tokens = {
 	{Opt_nodelalloc, "nodelalloc"},
 	{Opt_inode_readahead_blks, "inode_readahead_blks=%u"},
 	{Opt_journal_ioprio, "journal_ioprio=%u"},
+	{Opt_auto_da_alloc, "auto_da_alloc=%u"},
 	{Opt_err, NULL},
 };
 
@@ -1205,12 +1205,6 @@ static int parse_options(char *options, struct super_block *sb,
 			       "not supported\n");
 			break;
 #endif
-		case Opt_reservation:
-			set_opt(sbi->s_mount_opt, RESERVATION);
-			break;
-		case Opt_noreservation:
-			clear_opt(sbi->s_mount_opt, RESERVATION);
-			break;
 		case Opt_journal_update:
 			/* @@@ FIXME */
 			/* Eventually we will want to be able to create
@@ -1471,6 +1465,14 @@ set_qf_format:
 			*journal_ioprio = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE,
 							    option);
 			break;
+		case Opt_auto_da_alloc:
+			if (match_int(&args[0], &option))
+				return 0;
+			if (option)
+				clear_opt(sbi->s_mount_opt, NO_AUTO_DA_ALLOC);
+			else
+				set_opt(sbi->s_mount_opt,NO_AUTO_DA_ALLOC);
+			break;
 		default:
 			printk(KERN_ERR
 			       "EXT4-fs: Unrecognized mount option \"%s\" "
@@ -2099,7 +2101,6 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 	sbi->s_min_batch_time = EXT4_DEF_MIN_BATCH_TIME;
 	sbi->s_max_batch_time = EXT4_DEF_MAX_BATCH_TIME;
 
-	set_opt(sbi->s_mount_opt, RESERVATION);
 	set_opt(sbi->s_mount_opt, BARRIER);
 
 	/*
-- 
1.6.3.1.1.g75fc.dirty


* [PATCH,STABLE 2.6.29 10/18] ext4: Check for an valid i_mode when reading the inode from disk
  2009-06-02 12:07               ` [PATCH,STABLE 2.6.29 09/18] ext4: Add auto_da_alloc mount option Theodore Ts'o
@ 2009-06-02 12:07                 ` Theodore Ts'o
  2009-06-02 12:07                   ` [PATCH,STABLE 2.6.29 11/18] jbd2: Update locking coments Theodore Ts'o
  0 siblings, 1 reply; 26+ messages in thread
From: Theodore Ts'o @ 2009-06-02 12:07 UTC (permalink / raw)
  To: stable; +Cc: linux-ext4, Theodore Ts'o

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
(cherry picked from commit 563bdd61fe4dbd6b58cf7eb06f8d8f14479ae1dc)
---
 fs/ext4/inode.c |   10 +++++++++-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 8ff6762..c4f0e14 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4367,7 +4367,8 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
 			inode->i_op = &ext4_symlink_inode_operations;
 			ext4_set_aops(inode);
 		}
-	} else {
+	} else if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode) ||
+	      S_ISFIFO(inode->i_mode) || S_ISSOCK(inode->i_mode)) {
 		inode->i_op = &ext4_special_inode_operations;
 		if (raw_inode->i_block[0])
 			init_special_inode(inode, inode->i_mode,
@@ -4375,6 +4376,13 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
 		else
 			init_special_inode(inode, inode->i_mode,
 			   new_decode_dev(le32_to_cpu(raw_inode->i_block[1])));
+	} else {
+		brelse(bh);
+		ret = -EIO;
+		ext4_error(inode->i_sb, __func__, 
+			   "bogus i_mode (%o) for inode=%lu",
+			   inode->i_mode, inode->i_ino);
+		goto bad_inode;
 	}
 	brelse(iloc.bh);
 	ext4_set_inode_flags(inode);
-- 
1.6.3.1.1.g75fc.dirty


* [PATCH,STABLE 2.6.29 11/18] jbd2: Update locking coments
  2009-06-02 12:07                 ` [PATCH,STABLE 2.6.29 10/18] ext4: Check for an valid i_mode when reading the inode from disk Theodore Ts'o
@ 2009-06-02 12:07                   ` Theodore Ts'o
  2009-06-02 12:07                     ` [PATCH,STABLE 2.6.29 12/18] ext4: really print the find_group_flex fallback warning only once Theodore Ts'o
  0 siblings, 1 reply; 26+ messages in thread
From: Theodore Ts'o @ 2009-06-02 12:07 UTC (permalink / raw)
  To: stable; +Cc: linux-ext4, Jan Kara, Theodore Ts'o

From: Jan Kara <jack@suse.cz>

Update information about locking in JBD2 revoke code. Inconsistency in
comments found by Lin Tan <tammy000@gmail.com>.

CC: Lin Tan <tammy000@gmail.com>.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
(cherry picked from commit 86db97c87f744364d5889ca8a4134ca2048b8f83)
---
 fs/jbd2/revoke.c |   24 +++++++++++++++++++-----
 1 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/fs/jbd2/revoke.c b/fs/jbd2/revoke.c
index 257ff26..bbe6d59 100644
--- a/fs/jbd2/revoke.c
+++ b/fs/jbd2/revoke.c
@@ -55,6 +55,25 @@
  *			need do nothing.
  * RevokeValid set, Revoked set:
  *			buffer has been revoked.
+ *
+ * Locking rules:
+ * We keep two hash tables of revoke records. One hashtable belongs to the
+ * running transaction (is pointed to by journal->j_revoke), the other one
+ * belongs to the committing transaction. Accesses to the second hash table
+ * happen only from the kjournald and no other thread touches this table.  Also
+ * journal_switch_revoke_table() which switches which hashtable belongs to the
+ * running and which to the committing transaction is called only from
+ * kjournald. Therefore we need no locks when accessing the hashtable belonging
+ * to the committing transaction.
+ *
+ * All users operating on the hash table belonging to the running transaction
+ * have a handle to the transaction. Therefore they are safe from kjournald
+ * switching hash tables under them. For operations on the lists of entries in
+ * the hash table j_revoke_lock is used.
+ *
+ * Finally, also replay code uses the hash tables but at this moment noone else
+ * can touch them (filesystem isn't mounted yet) and hence no locking is
+ * needed.
  */
 
 #ifndef __KERNEL__
@@ -401,8 +420,6 @@ int jbd2_journal_revoke(handle_t *handle, unsigned long long blocknr,
  * the second time we would still have a pending revoke to cancel.  So,
  * do not trust the Revoked bit on buffers unless RevokeValid is also
  * set.
- *
- * The caller must have the journal locked.
  */
 int jbd2_journal_cancel_revoke(handle_t *handle, struct journal_head *jh)
 {
@@ -480,10 +497,7 @@ void jbd2_journal_switch_revoke_table(journal_t *journal)
 /*
  * Write revoke records to the journal for all entries in the current
  * revoke hash, deleting the entries as we go.
- *
- * Called with the journal lock held.
  */

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH,STABLE 2.6.29 12/18] ext4: really print the find_group_flex fallback warning only once
  2009-06-02 12:07                   ` [PATCH,STABLE 2.6.29 11/18] jbd2: Update locking coments Theodore Ts'o
@ 2009-06-02 12:07                     ` Theodore Ts'o
  2009-06-02 12:07                       ` [PATCH,STABLE 2.6.29 13/18] ext4: Fix softlockup caused by illegal i_file_acl value in on-disk inode Theodore Ts'o
  0 siblings, 1 reply; 26+ messages in thread
From: Theodore Ts'o @ 2009-06-02 12:07 UTC (permalink / raw)
  To: stable; +Cc: linux-ext4, Chuck Ebbert, Theodore Ts'o

From: Chuck Ebbert <cebbert@redhat.com>

Missing braces caused the warning to print more than once.

Signed-Off-By: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
(cherry picked from commit 6b82f3cb2d480b7714eb0ff61aee99c22160389e)
---
 fs/ext4/ialloc.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index befd95e..345cba1 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -720,11 +720,12 @@ struct inode *ext4_new_inode(handle_t *handle, struct inode *dir, int mode)
 		ret2 = find_group_flex(sb, dir, &group);
 		if (ret2 == -1) {
 			ret2 = find_group_other(sb, dir, &group);
-			if (ret2 == 0 && once)
+			if (ret2 == 0 && once) {
 				once = 0;
 				printk(KERN_NOTICE "ext4: find_group_flex "
 				       "failed, fallback succeeded dir %lu\n",
 				       dir->i_ino);
+			}
 		}
 		goto got_group;
 	}
-- 
1.6.3.1.1.g75fc.dirty


^ permalink raw reply related	[flat|nested] 26+ messages in thread
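
The missing-braces bug fixed by patch 12/18 above can be reproduced
in a few lines of userspace C.  This is a sketch under assumptions:
the function names and the warnings_printed counter are made up for
the illustration, and the counter stands in for the printk() call.

```c
static int warnings_printed;	/* stands in for the printk() */

/* Pre-patch shape: without braces only "once = 0" belongs to the if. */
static void fallback_warn_buggy(int ret2, int *once)
{
	if (ret2 == 0 && *once)
		*once = 0;
	warnings_printed++;	/* not governed by the if: runs every call */
}

/* Post-patch shape: both statements are inside the braces. */
static void fallback_warn_fixed(int ret2, int *once)
{
	if (ret2 == 0 && *once) {
		*once = 0;
		warnings_printed++;
	}
}

/* Call one variant several times, as repeated fallbacks would. */
static int count_warnings(void (*warn)(int, int *), int calls)
{
	int once = 1;
	int i;

	warnings_printed = 0;
	for (i = 0; i < calls; i++)
		warn(0, &once);
	return warnings_printed;
}
```

With three successful fallbacks the buggy variant counts three
warnings, while the fixed variant counts one.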

* [PATCH,STABLE 2.6.29 13/18] ext4: Fix softlockup caused by illegal i_file_acl value in on-disk inode
  2009-06-02 12:07                     ` [PATCH,STABLE 2.6.29 12/18] ext4: really print the find_group_flex fallback warning only once Theodore Ts'o
@ 2009-06-02 12:07                       ` Theodore Ts'o
  2009-06-02 12:07                         ` [PATCH,STABLE 2.6.29 14/18] ext4: Ignore i_file_acl_high unless EXT4_FEATURE_INCOMPAT_64BIT is present Theodore Ts'o
  2009-06-03 18:16                         ` Fix softlockup caused by illegal i_file_acl value in on-disk inode Andreas Dilger
  0 siblings, 2 replies; 26+ messages in thread
From: Theodore Ts'o @ 2009-06-02 12:07 UTC (permalink / raw)
  To: stable; +Cc: linux-ext4, Theodore Ts'o

If the block number of the external extended attribute block (which
is stored in i_file_acl and i_file_acl_high) lies beyond the end of
the on-disk filesystem, the process which tried to access the extended
attributes will endlessly issue kernel printks complaining that
"__find_get_block_slow() failed", locking up that CPU until the system
is forcibly rebooted.

So when we read in the inode, make sure the i_file_acl value is legal,
and if not, flag the filesystem as being corrupted.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
(cherry picked from commit 485c26ec70f823f2a9cf45982b724893e53a859e)
---
 fs/ext4/inode.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index c4f0e14..ec3457b 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4351,6 +4351,18 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
 			(__u64)(le32_to_cpu(raw_inode->i_version_hi)) << 32;
 	}
 
+	if (ei->i_file_acl &&
+	    ((ei->i_file_acl < 
+	      (le32_to_cpu(EXT4_SB(sb)->s_es->s_first_data_block) +
+	       EXT4_SB(sb)->s_gdb_count)) ||
+	     (ei->i_file_acl >= ext4_blocks_count(EXT4_SB(sb)->s_es)))) {
+		ext4_error(sb, __func__,
+			   "bad extended attribute block %llu in inode #%lu",
+			   ei->i_file_acl, inode->i_ino);
+		ret = -EIO;
+		goto bad_inode;
+	}
+
 	if (S_ISREG(inode->i_mode)) {
 		inode->i_op = &ext4_file_inode_operations;
 		inode->i_fop = &ext4_file_operations;
-- 
1.6.3.1.1.g75fc.dirty


^ permalink raw reply related	[flat|nested] 26+ messages in thread
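
The range check added by patch 13/18 above reduces to the following
predicate.  This is a minimal sketch, assuming the superblock fields
have already been converted to host byte order; xattr_block_ok() and
its parameter names are hypothetical and do not exist in the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if file_acl is an acceptable extended attribute block.
 * Zero means "no external xattr block" and is always accepted.
 */
static bool xattr_block_ok(uint64_t file_acl, uint32_t first_data_block,
			   uint32_t gdb_count, uint64_t blocks_count)
{
	if (file_acl == 0)
		return true;
	/* Too low: would land in the superblock or group descriptors. */
	if (file_acl < (uint64_t)first_data_block + gdb_count)
		return false;
	/* Too high: past the end of the filesystem. */
	if (file_acl >= blocks_count)
		return false;
	return true;
}
```

For example, with first_data_block = 1, gdb_count = 2 and a
1000-block filesystem, block 2 and block 1000 are rejected while
block 3 is accepted.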

* [PATCH,STABLE 2.6.29 14/18] ext4: Ignore i_file_acl_high unless EXT4_FEATURE_INCOMPAT_64BIT is present
  2009-06-02 12:07                       ` [PATCH,STABLE 2.6.29 13/18] ext4: Fix softlockup caused by illegal i_file_acl value in on-disk inode Theodore Ts'o
@ 2009-06-02 12:07                         ` Theodore Ts'o
  2009-06-02 12:07                           ` [PATCH,STABLE 2.6.29 15/18] ext4: Fix sub-block zeroing for writes into preallocated extents Theodore Ts'o
  2009-06-03 18:17                           ` [PATCH,STABLE 2.6.29 14/18] ext4: Ignore i_file_acl_high unless EXT4_FEATURE_INCOMPAT_64BIT is present Andreas Dilger
  2009-06-03 18:16                         ` Fix softlockup caused by illegal i_file_acl value in on-disk inode Andreas Dilger
  1 sibling, 2 replies; 26+ messages in thread
From: Theodore Ts'o @ 2009-06-02 12:07 UTC (permalink / raw)
  To: stable; +Cc: linux-ext4, Theodore Ts'o

Don't try to look at i_file_acl_high unless the INCOMPAT_64BIT feature
bit is set.  The field is normally zero, but older versions of e2fsck
didn't automatically check to make sure of this, so in the spirit of
"be liberal in what you accept", don't look at i_file_acl_high unless
we are using a 64-bit filesystem.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

(cherry picked from commit a9e817425dc0baede8ebe5fbc9984a640257432b)
---
 fs/ext4/inode.c |    4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index ec3457b..cf65a83 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4300,11 +4300,9 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
 	ei->i_flags = le32_to_cpu(raw_inode->i_flags);
 	inode->i_blocks = ext4_inode_blocks(raw_inode, ei);
 	ei->i_file_acl = le32_to_cpu(raw_inode->i_file_acl_lo);
-	if (EXT4_SB(inode->i_sb)->s_es->s_creator_os !=
-	    cpu_to_le32(EXT4_OS_HURD)) {
+	if (EXT4_HAS_INCOMPAT_FEATURE(sb, EXT4_FEATURE_INCOMPAT_64BIT))
 		ei->i_file_acl |=
 			((__u64)le16_to_cpu(raw_inode->i_file_acl_high)) << 32;
-	}
 	inode->i_size = ext4_isize(raw_inode);
 	ei->i_disksize = inode->i_size;
 	inode->i_generation = le32_to_cpu(raw_inode->i_generation);
-- 
1.6.3.1.1.g75fc.dirty


^ permalink raw reply related	[flat|nested] 26+ messages in thread
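
The effect of patch 14/18 above on the computed block number can be
sketched as below.  The helper name and parameters are assumptions
made for the illustration, and the little-endian conversions done by
the kernel are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Combine the two on-disk halves of i_file_acl.  The high 16 bits
 * are only trusted when the filesystem has the 64-bit feature.
 */
static uint64_t file_acl_block(uint32_t acl_lo, uint16_t acl_high,
			       bool has_64bit)
{
	uint64_t blk = acl_lo;

	if (has_64bit)
		blk |= (uint64_t)acl_high << 32;
	return blk;
}
```

On a filesystem without the 64-bit feature, stray garbage in the
high field no longer changes the result.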

* [PATCH,STABLE 2.6.29 15/18] ext4: Fix sub-block zeroing for writes into preallocated extents
  2009-06-02 12:07                         ` [PATCH,STABLE 2.6.29 14/18] ext4: Ignore i_file_acl_high unless EXT4_FEATURE_INCOMPAT_64BIT is present Theodore Ts'o
@ 2009-06-02 12:07                           ` Theodore Ts'o
  2009-06-02 12:07                             ` [PATCH,STABLE 2.6.29 16/18] ext4: Use a fake block number for delayed new buffer_head Theodore Ts'o
  2009-06-03 18:17                           ` [PATCH,STABLE 2.6.29 14/18] ext4: Ignore i_file_acl_high unless EXT4_FEATURE_INCOMPAT_64BIT is present Andreas Dilger
  1 sibling, 1 reply; 26+ messages in thread
From: Theodore Ts'o @ 2009-06-02 12:07 UTC (permalink / raw)
  To: stable; +Cc: linux-ext4, Aneesh Kumar K.V, Theodore Ts'o

From: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

We need to mark the buffer_head mapping preallocated space as new
during write_begin. Otherwise we don't zero out the page cache content
properly for a partial write. This will cause file corruption with
preallocation.

Now that we mark the buffer_head new we also need to have a valid
buffer_head blocknr so that unmap_underlying_metadata() unmaps the
correct block.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
(cherry picked from commit 9c1ee184a30394e54165fa4c15923cabd952c106)
---
 fs/ext4/extents.c |    2 ++
 fs/ext4/inode.c   |    7 +++++++
 2 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index e0aa4fe..6af5a50 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -2776,6 +2776,8 @@ int ext4_ext_get_blocks(handle_t *handle, struct inode *inode,
 				if (allocated > max_blocks)
 					allocated = max_blocks;
 				set_buffer_unwritten(bh_result);
+				bh_result->b_bdev = inode->i_sb->s_bdev;
+				bh_result->b_blocknr = newblock;
 				goto out2;
 			}
 
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index cf65a83..2caeda7 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2246,6 +2246,13 @@ static int ext4_da_get_block_prep(struct inode *inode, sector_t iblock,
 		set_buffer_delay(bh_result);
 	} else if (ret > 0) {
 		bh_result->b_size = (ret << inode->i_blkbits);
+		/*
+		 * With sub-block writes into unwritten extents
+		 * we also need to mark the buffer as new so that
+		 * the unwritten parts of the buffer gets correctly zeroed.
+		 */
+		if (buffer_unwritten(bh_result))
+			set_buffer_new(bh_result);
 		ret = 0;
 	}
 
-- 
1.6.3.1.1.g75fc.dirty


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH,STABLE 2.6.29 16/18] ext4: Use a fake block number for delayed new buffer_head
  2009-06-02 12:07                           ` [PATCH,STABLE 2.6.29 15/18] ext4: Fix sub-block zeroing for writes into preallocated extents Theodore Ts'o
@ 2009-06-02 12:07                             ` Theodore Ts'o
  2009-06-02 12:07                               ` [PATCH,STABLE 2.6.29 17/18] ext4: Clear the unwritten buffer_head flag after the extent is initialized Theodore Ts'o
  0 siblings, 1 reply; 26+ messages in thread
From: Theodore Ts'o @ 2009-06-02 12:07 UTC (permalink / raw)
  To: stable; +Cc: linux-ext4, Aneesh Kumar K.V, Theodore Ts'o

From: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

Use a very large unsigned number (~0xffff) as the fake block number
for the delayed new buffer. The VFS should never try to write out this
number, but if it does, this will make it obvious.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
(cherry picked from commit 33b9817e2ae097c7b8d256e3510ac6c54fc6d9d0)
---
 fs/ext4/inode.c |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 2caeda7..4ed5e92 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2220,6 +2220,10 @@ static int ext4_da_get_block_prep(struct inode *inode, sector_t iblock,
 				  struct buffer_head *bh_result, int create)
 {
 	int ret = 0;
+	sector_t invalid_block = ~((sector_t) 0xffff);
+
+	if (invalid_block < ext4_blocks_count(EXT4_SB(inode->i_sb)->s_es))
+		invalid_block = ~0;
 
 	BUG_ON(create == 0);
 	BUG_ON(bh_result->b_size != inode->i_sb->s_blocksize);
@@ -2241,7 +2245,7 @@ static int ext4_da_get_block_prep(struct inode *inode, sector_t iblock,
 			/* not enough space to reserve */
 			return ret;
 
-		map_bh(bh_result, inode->i_sb, 0);
+		map_bh(bh_result, inode->i_sb, invalid_block);
 		set_buffer_new(bh_result);
 		set_buffer_delay(bh_result);
 	} else if (ret > 0) {
-- 
1.6.3.1.1.g75fc.dirty


^ permalink raw reply related	[flat|nested] 26+ messages in thread
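
The choice of fake block number in patch 16/18 above can be sketched
as follows.  This assumes a 64-bit sector_t (modelled here as
uint64_t; the kernel type can be narrower depending on
configuration), and pick_invalid_block() is a hypothetical name.

```c
#include <stdint.h>

/*
 * Pick a block number that cannot be a real block on a filesystem
 * with blocks_count blocks.  Prefer ~0xffff; if the filesystem is so
 * large that this would be a valid block, fall back to all ones.
 */
static uint64_t pick_invalid_block(uint64_t blocks_count)
{
	uint64_t invalid_block = ~(uint64_t)0xffff;

	if (invalid_block < blocks_count)
		invalid_block = ~(uint64_t)0;
	return invalid_block;
}
```

For any realistic filesystem size the first value is used, which
makes an accidental write to it easy to spot in a trace.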

* [PATCH,STABLE 2.6.29 17/18] ext4: Clear the unwritten buffer_head flag after the extent is initialized
  2009-06-02 12:07                             ` [PATCH,STABLE 2.6.29 16/18] ext4: Use a fake block number for delayed new buffer_head Theodore Ts'o
@ 2009-06-02 12:07                               ` Theodore Ts'o
  2009-06-02 12:07                                 ` [PATCH,STABLE 2.6.29 18/18] ext4: Fix race in ext4_inode_info.i_cached_extent Theodore Ts'o
  0 siblings, 1 reply; 26+ messages in thread
From: Theodore Ts'o @ 2009-06-02 12:07 UTC (permalink / raw)
  To: stable; +Cc: linux-ext4, Aneesh Kumar K.V, Theodore Ts'o

From: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

The BH_Unwritten flag indicates that the buffer is allocated on disk
but has not been written; that is, the block was part of a persistent
preallocation area.  That flag should only be set when a get_blocks()
function is looking up an inode's logical to physical block mapping.

When ext4_get_blocks_wrap() is called with create=1, the uninitialized
extent is converted into an initialized one, so the BH_Unwritten flag
is no longer appropriate.  Hence, we need to make sure the
BH_Unwritten is not left set, since the combination of BH_Mapped and
BH_Unwritten is not allowed; among other things, it will cause ext4's
get_block() to be called over and over again during the write_begin
phase of write(2).

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
(cherry picked from commit 2a8964d63d50dd2d65d71d342bc7fb6ef4117614)
---
 fs/ext4/inode.c |   13 +++++++++++++
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 4ed5e92..b3d7250 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1076,6 +1076,7 @@ int ext4_get_blocks_wrap(handle_t *handle, struct inode *inode, sector_t block,
 	int retval;
 
 	clear_buffer_mapped(bh);
+	clear_buffer_unwritten(bh);
 
 	/*
 	 * Try to see if we can get  the block without requesting
@@ -1106,6 +1107,18 @@ int ext4_get_blocks_wrap(handle_t *handle, struct inode *inode, sector_t block,
 		return retval;
 
 	/*
+	 * When we call get_blocks without the create flag, the
+	 * BH_Unwritten flag could have gotten set if the blocks
+	 * requested were part of a uninitialized extent.  We need to
+	 * clear this flag now that we are committed to convert all or
+	 * part of the uninitialized extent to be an initialized
+	 * extent.  This is because we need to avoid the combination
+	 * of BH_Unwritten and BH_Mapped flags being simultaneously
+	 * set on the buffer_head.
+	 */
+	clear_buffer_unwritten(bh);
+
+	/*
 	 * New blocks allocate and/or writing to uninitialized extent
 	 * will possibly result in updating i_data, so we take
 	 * the write lock of i_data_sem, and call get_blocks()
-- 
1.6.3.1.1.g75fc.dirty


^ permalink raw reply related	[flat|nested] 26+ messages in thread
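
The flag invariant that patch 17/18 above enforces can be modelled
with a toy buffer_head.  Everything in this sketch is hypothetical
(the flag bits, the struct and the function names); it only shows
why the unwritten flag has to be cleared once a create call has
mapped the buffer.

```c
#include <stdbool.h>

/* Hypothetical flag bits; the kernel uses its own BH_* bit numbers. */
#define TOY_BH_MAPPED		(1u << 0)
#define TOY_BH_UNWRITTEN	(1u << 1)

struct toy_bh {
	unsigned int state;
};

/* Lookup without create: a preallocated extent is reported as unwritten. */
static void toy_lookup_unwritten(struct toy_bh *bh)
{
	bh->state &= ~TOY_BH_MAPPED;
	bh->state |= TOY_BH_UNWRITTEN;
}

/* Create: the extent becomes initialized, so drop the unwritten flag. */
static void toy_create(struct toy_bh *bh)
{
	bh->state &= ~TOY_BH_UNWRITTEN;
	bh->state |= TOY_BH_MAPPED;
}

/* Mapped and unwritten must never be set at the same time. */
static bool toy_state_legal(const struct toy_bh *bh)
{
	unsigned int both = TOY_BH_MAPPED | TOY_BH_UNWRITTEN;

	return (bh->state & both) != both;
}
```

Without the clearing step in toy_create(), a lookup followed by a
create would leave both bits set, which is the combination the
commit message describes as not allowed.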

* [PATCH,STABLE 2.6.29 18/18] ext4: Fix race in ext4_inode_info.i_cached_extent
  2009-06-02 12:07                               ` [PATCH,STABLE 2.6.29 17/18] ext4: Clear the unwritten buffer_head flag after the extent is initialized Theodore Ts'o
@ 2009-06-02 12:07                                 ` Theodore Ts'o
  0 siblings, 0 replies; 26+ messages in thread
From: Theodore Ts'o @ 2009-06-02 12:07 UTC (permalink / raw)
  To: stable; +Cc: linux-ext4, Theodore Ts'o

If two CPUs call ext4_ext_get_blocks() at the same time, there is
nothing protecting the i_cached_extent structure from being used and
updated concurrently.  This could potentially cause the wrong location
on disk to be read or written to, including potentially causing the
corruption of the block group descriptors and/or inode table.

This bug has been in the ext4 code since almost the very beginning of
ext4's development.  Fortunately once the data is stored in the page
cache, ext4_get_blocks() doesn't need to be called, so trying to
replicate this problem to the point where we could identify its root
cause was *extremely* difficult.  Many thanks to Kevin Shanahan for
working over several months to be able to reproduce this easily so we
could finally nail down the cause of the corruption.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit 2ec0ae3acec47f628179ee95fe2c4da01b5e9fc4)
---
 fs/ext4/extents.c |   17 ++++++++++++-----
 1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 6af5a50..d315c97 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -1740,11 +1740,13 @@ ext4_ext_put_in_cache(struct inode *inode, ext4_lblk_t block,
 {
 	struct ext4_ext_cache *cex;
 	BUG_ON(len == 0);
+	spin_lock(&EXT4_I(inode)->i_block_reservation_lock);
 	cex = &EXT4_I(inode)->i_cached_extent;
 	cex->ec_type = type;
 	cex->ec_block = block;
 	cex->ec_len = len;
 	cex->ec_start = start;
+	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
 }
 
 /*
@@ -1801,12 +1803,17 @@ ext4_ext_in_cache(struct inode *inode, ext4_lblk_t block,
 			struct ext4_extent *ex)
 {
 	struct ext4_ext_cache *cex;
+	int ret = EXT4_EXT_CACHE_NO;
 
+	/* 
+	 * We borrow i_block_reservation_lock to protect i_cached_extent
+	 */
+	spin_lock(&EXT4_I(inode)->i_block_reservation_lock);
 	cex = &EXT4_I(inode)->i_cached_extent;
 
 	/* has cache valid data? */
 	if (cex->ec_type == EXT4_EXT_CACHE_NO)
-		return EXT4_EXT_CACHE_NO;
+		goto errout;
 
 	BUG_ON(cex->ec_type != EXT4_EXT_CACHE_GAP &&
 			cex->ec_type != EXT4_EXT_CACHE_EXTENT);
@@ -1817,11 +1824,11 @@ ext4_ext_in_cache(struct inode *inode, ext4_lblk_t block,
 		ext_debug("%u cached by %u:%u:%llu\n",
 				block,
 				cex->ec_block, cex->ec_len, cex->ec_start);
-		return cex->ec_type;
+		ret = cex->ec_type;
 	}

^ permalink raw reply related	[flat|nested] 26+ messages in thread
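
The locking pattern of patch 18/18 above (take the lock, work on the
cached extent, leave through a single exit that drops the lock) can
be sketched in userspace.  This is an assumption-laden illustration:
it uses a pthread mutex where the kernel uses a spinlock, and all
toy_* names and constants are hypothetical.

```c
#include <pthread.h>
#include <stdint.h>

#define TOY_CACHE_NO		0
#define TOY_CACHE_EXTENT	2

struct toy_ext_cache {
	uint64_t start;	/* first physical block */
	uint32_t block;	/* first logical block */
	uint32_t len;	/* number of blocks */
	int type;
};

struct toy_inode {
	pthread_mutex_t lock;	/* protects cached */
	struct toy_ext_cache cached;
};

/* Update all fields of the cache under the lock. */
static void toy_put_in_cache(struct toy_inode *ino, uint32_t block,
			     uint32_t len, uint64_t start, int type)
{
	pthread_mutex_lock(&ino->lock);
	ino->cached.type = type;
	ino->cached.block = block;
	ino->cached.len = len;
	ino->cached.start = start;
	pthread_mutex_unlock(&ino->lock);
}

/* Copy the cached extent out under the lock if it covers block. */
static int toy_in_cache(struct toy_inode *ino, uint32_t block,
			struct toy_ext_cache *out)
{
	int ret = TOY_CACHE_NO;

	pthread_mutex_lock(&ino->lock);
	if (ino->cached.type != TOY_CACHE_NO &&
	    block >= ino->cached.block &&
	    block - ino->cached.block < ino->cached.len) {
		*out = ino->cached;
		ret = ino->cached.type;
	}
	pthread_mutex_unlock(&ino->lock);
	return ret;
}
```

Because the reader copies the whole entry while holding the lock, it
cannot observe a half-updated mix of an old ec_block and a new
ec_start, which is the failure mode the commit message describes.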

* Re: [PATCH,STABLE 2.6.29 14/18] ext4: Ignore i_file_acl_high unless EXT4_FEATURE_INCOMPAT_64BIT is present
  2009-06-02 12:07                         ` [PATCH,STABLE 2.6.29 14/18] ext4: Ignore i_file_acl_high unless EXT4_FEATURE_INCOMPAT_64BIT is present Theodore Ts'o
  2009-06-02 12:07                           ` [PATCH,STABLE 2.6.29 15/18] ext4: Fix sub-block zeroing for writes into preallocated extents Theodore Ts'o
@ 2009-06-03 18:17                           ` Andreas Dilger
  1 sibling, 0 replies; 26+ messages in thread
From: Andreas Dilger @ 2009-06-03 18:17 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: linux-ext4

On Jun 02, 2009  08:07 -0400, Theodore Ts'o wrote:
> Don't try to look at i_file_acl_high unless the INCOMPAT_64BIT feature
> bit is set.  The field is normally zero, but older versions of e2fsck
> didn't automatically check to make sure of this, so in the spirit of
> "be liberal in what you accept", don't look at i_file_acl_high unless
> we are using a 64-bit filesystem.

Should we do the same with other "_hi" fields in the inode?  There are
many cases like this for EXT4_DESC_SIZE(sb) >= EXT4_MIN_DESC_SIZE_64BIT
in super.c.  Does e2fsck check and zero those already?

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Fix softlockup caused by illegal i_file_acl value in on-disk inode
  2009-06-02 12:07                       ` [PATCH,STABLE 2.6.29 13/18] ext4: Fix softlockup caused by illegal i_file_acl value in on-disk inode Theodore Ts'o
  2009-06-02 12:07                         ` [PATCH,STABLE 2.6.29 14/18] ext4: Ignore i_file_acl_high unless EXT4_FEATURE_INCOMPAT_64BIT is present Theodore Ts'o
@ 2009-06-03 18:16                         ` Andreas Dilger
  2009-06-03 19:24                           ` Theodore Tso
  1 sibling, 1 reply; 26+ messages in thread
From: Andreas Dilger @ 2009-06-03 18:16 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: linux-ext4

On Jun 02, 2009  08:07 -0400, Theodore Ts'o wrote:
> +	if (ei->i_file_acl &&
> +	    ((ei->i_file_acl < 
> +	      (le32_to_cpu(EXT4_SB(sb)->s_es->s_first_data_block) +
> +	       EXT4_SB(sb)->s_gdb_count)) ||
> +	     (ei->i_file_acl >= ext4_blocks_count(EXT4_SB(sb)->s_es)))) {

I was just thinking it might make sense to wrap this check into a helper
like the following.  We check the validity of blocks in at least half a
dozen different places.  The elaborate ext4_blocktype is to allow for
future expansion of this checking mechanism to allow it to check for
blocks overlapping with e.g. the inode table and such, and possibly for
using with the jbd2 buffer checksum mechanism at some later date.

enum ext4_blocktype {
	EXT4_BT_SUPERBLOCK	=  1,
	EXT4_BT_GDT		=  2,
	EXT4_BT_INODE_BITMAP	=  3,
	EXT4_BT_BLOCK_BITMAP	=  4,
	EXT4_BT_INODE_TABLE	=  5,
	EXT4_BT_DIRECTORY_ROOT  = 10,
	EXT4_BT_DIRECTORY_LEAF  = 11,
	EXT4_BT_DIRECTORY_HTREE = 12,
	EXT4_BT_INDIRECT	= 21,
	EXT4_BT_DINDIRECT	= 22,
	EXT4_BT_TINDIRECT	= 23,
	EXT4_BT_EXTENT_INDEX    = 25,
	EXT4_BT_EXTENT_LEAF	= 26,
	EXT4_BT_DATA_BLOCK	= 30,
	EXT4_BT_ACL_BLOCK	= 31,
};

bool ext4_block_valid(struct super_block *sb, ext4_fsblk_t block,
		      enum ext4_blocktype type)
{
	/* type is unused for now; it is there for future checks */
	if (block < le32_to_cpu(EXT4_SB(sb)->s_es->s_first_data_block) +
			EXT4_SB(sb)->s_gdb_count ||
	    block >= ext4_blocks_count(EXT4_SB(sb)->s_es))
		return false;

	return true;
}

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Fix softlockup caused by illegal i_file_acl value in on-disk inode
  2009-06-03 18:16                         ` Fix softlockup caused by illegal i_file_acl value in on-disk inode Andreas Dilger
@ 2009-06-03 19:24                           ` Theodore Tso
  0 siblings, 0 replies; 26+ messages in thread
From: Theodore Tso @ 2009-06-03 19:24 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: linux-ext4

On Wed, Jun 03, 2009 at 12:16:08PM -0600, Andreas Dilger wrote:
> On Jun 02, 2009  08:07 -0400, Theodore Ts'o wrote:
> > +	if (ei->i_file_acl &&
> > +	    ((ei->i_file_acl < 
> > +	      (le32_to_cpu(EXT4_SB(sb)->s_es->s_first_data_block) +
> > +	       EXT4_SB(sb)->s_gdb_count)) ||
> > +	     (ei->i_file_acl >= ext4_blocks_count(EXT4_SB(sb)->s_es)))) {
> 
> I was just thinking it might make sense to wrap this check into a helper
> like the following.  We check the validity of blocks in at least half a
> dozen different places.  The elaborate ext4_blocktype is to allow for
> future expansion of this checking mechanism to allow it to check for
> blocks overlapping with e.g. the inode table and such, and possibly for
> using with the jbd2 buffer checksum mechanism at some later date.

We do have a helper function that is waiting to be merged in the patch
queue.  See the patch "add-check-block-validity-to-ext4_get_blocks_wrap".  

It doesn't have the blocktype extension, since to keep things fast and
simple, I have a single red-black tree for any blocks that shouldn't
be used for file blocks.  This allows for a *much* more compact
representation in the red-black tree, thanks to flex_bg putting the
block and inode bitmaps and inode tables back-to-back with each other.
If I were to add blocktype information to the red-black tree that
ext4_data_block_valid() could check against, the red-black tree would
at least triple in size.

The nice thing about this patch (which will be merged for 2.6.31) is
that it's a runtime mount option.  So if we have a customer that runs
into problems, we don't have to ship them a custom debugging kernel;
we just tell them to mount the filesystem with block_validity, and we
can start debugging the problem right away.

					- Ted

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH,STABLE 2.6.29 06/18] ext4: Automatically allocate delay allocated blocks on close
  2009-06-02 12:07         ` [PATCH,STABLE 2.6.29 06/18] ext4: Automatically allocate delay allocated blocks on close Theodore Ts'o
  2009-06-02 12:07           ` [PATCH,STABLE 2.6.29 07/18] ext4: Automatically allocate delay allocated blocks on rename Theodore Ts'o
@ 2009-06-03 18:14           ` Andreas Dilger
  2009-06-03 19:29             ` Theodore Tso
  1 sibling, 1 reply; 26+ messages in thread
From: Andreas Dilger @ 2009-06-03 18:14 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: linux-ext4, stable

On Jun 02, 2009  08:07 -0400, Theodore Ts'o wrote:
> When closing a file that had been previously truncated, force any
> delay allocated blocks that to be allocated so that if the filesystem
> is mounted with data=ordered, the data blocks will be pushed out to
> disk along with the journal commit.  Many application programs expect
> this, so we do this to avoid zero length files if the system crashes
> unexpectedly.
> 
> @@ -3880,6 +3880,9 @@ void ext4_truncate(struct inode *inode)
>  	if (!ext4_can_truncate(inode))
>  		return;
>  
> +	if (inode->i_size == 0)
> +		ei->i_state |= EXT4_STATE_DA_ALLOC_CLOSE;

Since some applications open files with open(..., O_WRONLY|O_CREAT|O_TRUNC)
to avoid re-using existing files (and avoiding the need to check if the
file already exists to modify the flags), it would make sense to set
EXT4_STATE_DA_ALLOC_CLOSE only if the file previously had some data in it.

By the time we get to ext4_truncate() i_size is overwritten already, but
it might make sense to also check i_disksize != 0 before setting this flag.
Otherwise delayed allocation may be inadvertently disabled for these apps
when it should not be.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH,STABLE 2.6.29 06/18] ext4: Automatically allocate delay allocated blocks on close
  2009-06-03 18:14           ` [PATCH,STABLE 2.6.29 06/18] ext4: Automatically allocate delay allocated blocks on close Andreas Dilger
@ 2009-06-03 19:29             ` Theodore Tso
  0 siblings, 0 replies; 26+ messages in thread
From: Theodore Tso @ 2009-06-03 19:29 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: linux-ext4, stable

On Wed, Jun 03, 2009 at 12:14:19PM -0600, Andreas Dilger wrote:
> 
> Since some applications open files with open(..., O_WRONLY|O_CREAT|O_TRUNC)
> to avoid re-using existing files (and avoiding the need to check if the
> file already exists to modify the flags), it would make sense to set
> EXT4_STATE_DA_ALLOC_CLOSE only if the file previously had some data in it.
> 
> By the time we get to ext4_truncate() i_size is overwritten already, but
> it might make sense to also check i_disksize != 0 before setting this flag.
> Otherwise delayed allocation may be inadvertently disabled for these apps
> when it should not be.

Agreed; I'll make such a change for the ext4 patch queue.  We can
propagate such a patch to the -stable kernels once it's in mainline.

	       	       	      	      	      - Ted

^ permalink raw reply	[flat|nested] 26+ messages in thread

* patch ext4-return-eio-not-estale-on-directory-traversal-through-deleted-inode.patch added to 2.6.29-stable tree
  2009-06-02 12:07   ` [PATCH,STABLE 2.6.29 03/18] ext4: return -EIO not -ESTALE on directory traversal through deleted inode Theodore Ts'o
  2009-06-02 12:07     ` [PATCH,STABLE 2.6.29 04/18] ext4: Add fine print for the 32000 subdirectory limit Theodore Ts'o
@ 2009-06-09  9:33     ` gregkh
  1 sibling, 0 replies; 26+ messages in thread
From: gregkh @ 2009-06-09  9:33 UTC (permalink / raw)
  To: bdonlan, akpm, gregkh, linux-ext4, tytso; +Cc: stable, stable-commits


This is a note to let you know that we have just queued up the patch titled

    Subject: ext4: return -EIO not -ESTALE on directory traversal through deleted inode

to the 2.6.29-stable tree.  Its filename is

    ext4-return-eio-not-estale-on-directory-traversal-through-deleted-inode.patch

A git repo of this tree can be found at 
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary


>From stable-bounces@linux.kernel.org  Tue Jun  9 02:25:08 2009
From: Bryan Donlan <bdonlan@gmail.com>
Date: Tue,  2 Jun 2009 08:07:44 -0400
Subject: ext4: return -EIO not -ESTALE on directory traversal through deleted inode
To: stable@kernel.org
Cc: "Theodore Ts'o" <tytso@mit.edu>, Andrew Morton <akpm@linux-foundation.org>, linux-ext4@vger.kernel.org, Bryan Donlan <bdonlan@gmail.com>
Message-ID: <1243944479-20574-3-git-send-email-tytso@mit.edu>


From: Bryan Donlan <bdonlan@gmail.com>

(cherry picked from commit e6f009b0b45220c004672d41a58865e94946104d)

ext4_iget() returns -ESTALE if invoked on a deleted inode, in order to
report errors to NFS properly.  However, in ext4_lookup(), this
-ESTALE can be propagated to userspace if the filesystem is corrupted
such that a directory entry references a deleted inode.  This leads to
a misleading error message - "Stale NFS file handle" - and confusion
on the part of the admin.

The bug can be easily reproduced by creating a new filesystem, making
a link to an unused inode using debugfs, then mounting and attempting
to ls -l said link.

This patch thus changes ext4_lookup to return -EIO if it receives
-ESTALE from ext4_iget(), as ext4 does for other filesystem metadata
corruption; and also invokes the appropriate ext*_error functions when
this case is detected.

Signed-off-by: Bryan Donlan <bdonlan@gmail.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 fs/ext4/namei.c |   12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1052,8 +1052,16 @@ static struct dentry *ext4_lookup(struct
 			return ERR_PTR(-EIO);
 		}
 		inode = ext4_iget(dir->i_sb, ino);
-		if (IS_ERR(inode))
-			return ERR_CAST(inode);
+		if (unlikely(IS_ERR(inode))) {
+			if (PTR_ERR(inode) == -ESTALE) {
+				ext4_error(dir->i_sb, __func__,
+						"deleted inode referenced: %u",
+						ino);
+				return ERR_PTR(-EIO);
+			} else {
+				return ERR_CAST(inode);
+			}
+		}
 	}
 	return d_splice_alias(inode, dentry);
 }


Patches currently in stable-queue which might be from bdonlan@gmail.com are


^ permalink raw reply	[flat|nested] 26+ messages in thread
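
The error translation described in the patch above amounts to the
small mapping below.  The helper name is hypothetical and the sketch
leaves out the ext4_error() call that the real patch makes when it
sees the stale inode.

```c
#include <errno.h>

/*
 * Map the error returned by an inode lookup to what a directory
 * lookup should report: a deleted inode referenced from a directory
 * entry is on-disk corruption (-EIO), not a stale NFS handle.
 */
static int lookup_error_for_user(int iget_err)
{
	if (iget_err == -ESTALE)
		return -EIO;
	return iget_err;	/* all other errors pass through */
}
```

Other errors, such as -ENOMEM, are passed through unchanged, which
matches the ERR_CAST() branch in the diff.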

* patch ext4-tighten-restrictions-on-inode-flags.patch added to 2.6.29-stable tree
  2009-06-02 12:07 ` [PATCH,STABLE 2.6.29 02/18] ext4: tighten restrictions on inode flags Theodore Ts'o
  2009-06-02 12:07   ` [PATCH,STABLE 2.6.29 03/18] ext4: return -EIO not -ESTALE on directory traversal through deleted inode Theodore Ts'o
@ 2009-06-09  9:33   ` gregkh
  1 sibling, 0 replies; 26+ messages in thread
From: gregkh @ 2009-06-09  9:33 UTC (permalink / raw)
  To: duaneg, adilger, akpm, gregkh, linux-ext4, tytso; +Cc: stable, stable-commits


This is a note to let you know that we have just queued up the patch titled

    Subject: ext4: tighten restrictions on inode flags

to the 2.6.29-stable tree.  Its filename is

    ext4-tighten-restrictions-on-inode-flags.patch

A git repo of this tree can be found at 
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary


>From stable-bounces@linux.kernel.org  Tue Jun  9 02:24:29 2009
From: Duane Griffin <duaneg@dghda.com>
Date: Tue,  2 Jun 2009 08:07:43 -0400
Subject: ext4: tighten restrictions on inode flags
To: stable@kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>, linux-ext4@vger.kernel.org, "Theodore Ts'o" <tytso@mit.edu>, Duane Griffin <duaneg@dghda.com>
Message-ID: <1243944479-20574-2-git-send-email-tytso@mit.edu>


From: Duane Griffin <duaneg@dghda.com>

(cherry picked from commit 2dc6b0d48ca0599837df21b14bb8393d0804af57)

At the moment there are few restrictions on which flags may be set on
which inodes.  Specifically, DIRSYNC may only be set on directories, and
IMMUTABLE and APPEND may not be set on links.  Tighten that to disallow
TOPDIR on non-directories and to allow only NODUMP and NOATIME on inodes
that are neither regular files nor directories.

Introduce a flag-masking function, which masks flags based on mode, and
use it during inode creation and when flags are set via the ioctl to
facilitate future consistency.

Signed-off-by: Duane Griffin <duaneg@dghda.com>
Acked-by: Andreas Dilger <adilger@sun.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 fs/ext4/ext4.h   |   17 +++++++++++++++++
 fs/ext4/ialloc.c |   14 +++++---------
 fs/ext4/ioctl.c  |    3 +--
 3 files changed, 23 insertions(+), 11 deletions(-)

--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -255,6 +255,23 @@ struct flex_groups {
 			   EXT4_NOCOMPR_FL | EXT4_JOURNAL_DATA_FL |\
 			   EXT4_NOTAIL_FL | EXT4_DIRSYNC_FL)
 
+/* Flags that are appropriate for regular files (all but dir-specific ones). */
+#define EXT4_REG_FLMASK (~(EXT4_DIRSYNC_FL | EXT4_TOPDIR_FL))
+
+/* Flags that are appropriate for non-directories/regular files. */
+#define EXT4_OTHER_FLMASK (EXT4_NODUMP_FL | EXT4_NOATIME_FL)
+
+/* Mask out flags that are inappropriate for the given type of inode. */
+static inline __u32 ext4_mask_flags(umode_t mode, __u32 flags)
+{
+	if (S_ISDIR(mode))
+		return flags;
+	else if (S_ISREG(mode))
+		return flags & EXT4_REG_FLMASK;
+	else
+		return flags & EXT4_OTHER_FLMASK;
+}
+
 /*
  * Inode dynamic state flags
  */
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -885,16 +885,12 @@ got:
 	ei->i_disksize = 0;
 
 	/*
-	 * Don't inherit extent flag from directory. We set extent flag on
-	 * newly created directory and file only if -o extent mount option is
-	 * specified
+	 * Don't inherit extent flag from directory, amongst others. We set
+	 * extent flag on newly created directory and file only if -o extent
+	 * mount option is specified
 	 */
-	ei->i_flags = EXT4_I(dir)->i_flags & EXT4_FL_INHERITED;
-	if (S_ISLNK(mode))
-		ei->i_flags &= ~(EXT4_IMMUTABLE_FL|EXT4_APPEND_FL);
-	/* dirsync only applies to directories */
-	if (!S_ISDIR(mode))
-		ei->i_flags &= ~EXT4_DIRSYNC_FL;
+	ei->i_flags =
+		ext4_mask_flags(mode, EXT4_I(dir)->i_flags & EXT4_FL_INHERITED);
 	ei->i_file_acl = 0;
 	ei->i_dtime = 0;
 	ei->i_block_group = group;
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -48,8 +48,7 @@ long ext4_ioctl(struct file *filp, unsig
 		if (err)
 			return err;
 
-		if (!S_ISDIR(inode->i_mode))
-			flags &= ~EXT4_DIRSYNC_FL;
+		flags = ext4_mask_flags(inode->i_mode, flags);
 
 		err = -EPERM;
 		mutex_lock(&inode->i_mutex);


Patches currently in stable-queue which might be from duaneg@dghda.com are



* patch ext4-don-t-inherit-inappropriate-inode-flags-from-parent.patch added to 2.6.29-stable tree
  2009-06-02 12:07 [PATCH,STABLE 2.6.29 01/18] ext4: don't inherit inappropriate inode flags from parent Theodore Ts'o
  2009-06-02 12:07 ` [PATCH,STABLE 2.6.29 02/18] ext4: tighten restrictions on inode flags Theodore Ts'o
@ 2009-06-09  9:33 ` gregkh
  1 sibling, 0 replies; 26+ messages in thread
From: gregkh @ 2009-06-09  9:33 UTC (permalink / raw)
  To: duaneg, adilger, akpm, gregkh, linux-ext4, tytso; +Cc: stable, stable-commits


This is a note to let you know that we have just queued up the patch titled

    Subject: ext4: don't inherit inappropriate inode flags from parent

to the 2.6.29-stable tree.  Its filename is

    ext4-don-t-inherit-inappropriate-inode-flags-from-parent.patch

A git repo of this tree can be found at 
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary


>From stable-bounces@linux.kernel.org  Tue Jun  9 02:24:00 2009
From: Duane Griffin <duaneg@dghda.com>
Date: Tue,  2 Jun 2009 08:07:42 -0400
Subject: ext4: don't inherit inappropriate inode flags from parent
To: stable@kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>, linux-ext4@vger.kernel.org, "Theodore Ts'o" <tytso@mit.edu>, Duane Griffin <duaneg@dghda.com>
Message-ID: <1243944479-20574-1-git-send-email-tytso@mit.edu>


From: Duane Griffin <duaneg@dghda.com>

(cherry picked from commit 8fa43a81b97853fc69417bb6054182e78f95cbeb)

At present INDEX and EXTENTS are the only flags that new ext4 inodes do
NOT inherit from their parent.  In addition, prevent the flags DIRTY,
ECOMPR, IMAGIC, TOPDIR, HUGE_FILE and EXT_MIGRATE from being inherited.
List inheritable flags explicitly to prevent future flags from
accidentally being inherited.

This fixes the TOPDIR flag inheritance bug reported at
http://bugzilla.kernel.org/show_bug.cgi?id=9866.

Signed-off-by: Duane Griffin <duaneg@dghda.com>
Acked-by: Andreas Dilger <adilger@sun.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 fs/ext4/ext4.h   |    7 +++++++
 fs/ext4/ialloc.c |    2 +-
 2 files changed, 8 insertions(+), 1 deletion(-)

--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -248,6 +248,13 @@ struct flex_groups {
 #define EXT4_FL_USER_VISIBLE		0x000BDFFF /* User visible flags */
 #define EXT4_FL_USER_MODIFIABLE		0x000B80FF /* User modifiable flags */
 
+/* Flags that should be inherited by new inodes from their parent. */
+#define EXT4_FL_INHERITED (EXT4_SECRM_FL | EXT4_UNRM_FL | EXT4_COMPR_FL |\
+			   EXT4_SYNC_FL | EXT4_IMMUTABLE_FL | EXT4_APPEND_FL |\
+			   EXT4_NODUMP_FL | EXT4_NOATIME_FL |\
+			   EXT4_NOCOMPR_FL | EXT4_JOURNAL_DATA_FL |\
+			   EXT4_NOTAIL_FL | EXT4_DIRSYNC_FL)
+
 /*
  * Inode dynamic state flags
  */
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -889,7 +889,7 @@ got:
 	 * newly created directory and file only if -o extent mount option is
 	 * specified
 	 */
-	ei->i_flags = EXT4_I(dir)->i_flags & ~(EXT4_INDEX_FL|EXT4_EXTENTS_FL);
+	ei->i_flags = EXT4_I(dir)->i_flags & EXT4_FL_INHERITED;
 	if (S_ISLNK(mode))
 		ei->i_flags &= ~(EXT4_IMMUTABLE_FL|EXT4_APPEND_FL);
 	/* dirsync only applies to directories */
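
The difference between the old exclusion mask and the new explicit list
can be seen in a small user-space model.  This is a sketch only: the
DEMO_* names and inherit_old()/inherit_new() are hypothetical, the flag
values are quoted from memory of fs/ext4/ext4.h, and the inheritable
list is cut down to a single flag where the real EXT4_FL_INHERITED names
twelve.

```c
#include <stdint.h>

/* Flag values assumed for illustration; verify against fs/ext4/ext4.h. */
#define DEMO_NOATIME_FL 0x00000080u
#define DEMO_INDEX_FL   0x00001000u
#define DEMO_TOPDIR_FL  0x00020000u
#define DEMO_EXTENTS_FL 0x00080000u

/* Shortened whitelist; the real EXT4_FL_INHERITED lists twelve flags. */
#define DEMO_FL_INHERITED (DEMO_NOATIME_FL)

/* Before the patch: inherit everything except two named flags. */
uint32_t inherit_old(uint32_t parent)
{
	return parent & ~(DEMO_INDEX_FL | DEMO_EXTENTS_FL);
}

/* After the patch: inherit only flags that are explicitly listed. */
uint32_t inherit_new(uint32_t parent)
{
	return parent & DEMO_FL_INHERITED;
}
```

For a parent carrying TOPDIR, NOATIME and INDEX, inherit_old() keeps
TOPDIR while inherit_new() drops it, which is the behaviour reported in
the bugzilla entry above.  A flag added to the header later is likewise
not inherited unless someone adds it to the list.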


Patches currently in stable-queue which might be from duaneg@dghda.com are



end of thread, other threads:[~2009-06-09  9:42 UTC | newest]

Thread overview: 26+ messages
2009-06-02 12:07 [PATCH,STABLE 2.6.29 01/18] ext4: don't inherit inappropriate inode flags from parent Theodore Ts'o
2009-06-02 12:07 ` [PATCH,STABLE 2.6.29 02/18] ext4: tighten restrictions on inode flags Theodore Ts'o
2009-06-02 12:07   ` [PATCH,STABLE 2.6.29 03/18] ext4: return -EIO not -ESTALE on directory traversal through deleted inode Theodore Ts'o
2009-06-02 12:07     ` [PATCH,STABLE 2.6.29 04/18] ext4: Add fine print for the 32000 subdirectory limit Theodore Ts'o
2009-06-02 12:07       ` [PATCH,STABLE 2.6.29 05/18] ext4: add EXT4_IOC_ALLOC_DA_BLKS ioctl Theodore Ts'o
2009-06-02 12:07         ` [PATCH,STABLE 2.6.29 06/18] ext4: Automatically allocate delay allocated blocks on close Theodore Ts'o
2009-06-02 12:07           ` [PATCH,STABLE 2.6.29 07/18] ext4: Automatically allocate delay allocated blocks on rename Theodore Ts'o
2009-06-02 12:07             ` [PATCH,STABLE 2.6.29 08/18] ext4: Fix discard of inode prealloc space with delayed allocation Theodore Ts'o
2009-06-02 12:07               ` [PATCH,STABLE 2.6.29 09/18] ext4: Add auto_da_alloc mount option Theodore Ts'o
2009-06-02 12:07                 ` [PATCH,STABLE 2.6.29 10/18] ext4: Check for an valid i_mode when reading the inode from disk Theodore Ts'o
2009-06-02 12:07                   ` [PATCH,STABLE 2.6.29 11/18] jbd2: Update locking coments Theodore Ts'o
2009-06-02 12:07                     ` [PATCH,STABLE 2.6.29 12/18] ext4: really print the find_group_flex fallback warning only once Theodore Ts'o
2009-06-02 12:07                       ` [PATCH,STABLE 2.6.29 13/18] ext4: Fix softlockup caused by illegal i_file_acl value in on-disk inode Theodore Ts'o
2009-06-02 12:07                         ` [PATCH,STABLE 2.6.29 14/18] ext4: Ignore i_file_acl_high unless EXT4_FEATURE_INCOMPAT_64BIT is present Theodore Ts'o
2009-06-02 12:07                           ` [PATCH,STABLE 2.6.29 15/18] ext4: Fix sub-block zeroing for writes into preallocated extents Theodore Ts'o
2009-06-02 12:07                             ` [PATCH,STABLE 2.6.29 16/18] ext4: Use a fake block number for delayed new buffer_head Theodore Ts'o
2009-06-02 12:07                               ` [PATCH,STABLE 2.6.29 17/18] ext4: Clear the unwritten buffer_head flag after the extent is initialized Theodore Ts'o
2009-06-02 12:07                                 ` [PATCH,STABLE 2.6.29 18/18] ext4: Fix race in ext4_inode_info.i_cached_extent Theodore Ts'o
2009-06-03 18:17                           ` [PATCH,STABLE 2.6.29 14/18] ext4: Ignore i_file_acl_high unless EXT4_FEATURE_INCOMPAT_64BIT is present Andreas Dilger
2009-06-03 18:16                         ` Fix softlockup caused by illegal i_file_acl value in on-disk inode Andreas Dilger
2009-06-03 19:24                           ` Theodore Tso
2009-06-03 18:14           ` [PATCH,STABLE 2.6.29 06/18] ext4: Automatically allocate delay allocated blocks on close Andreas Dilger
2009-06-03 19:29             ` Theodore Tso
2009-06-09  9:33     ` patch ext4-return-eio-not-estale-on-directory-traversal-through-deleted-inode.patch added to 2.6.29-stable tree gregkh
2009-06-09  9:33   ` patch ext4-tighten-restrictions-on-inode-flags.patch " gregkh
2009-06-09  9:33 ` patch ext4-don-t-inherit-inappropriate-inode-flags-from-parent.patch " gregkh
