[PATCH 1/4] ext3: Fix ext3_dx_readdir hash collision handling

linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 1/4] ext3: Fix ext3_dx_readdir hash collision handling
@ 2008-09-13 15:32 Theodore Ts'o
  2008-09-13 15:32 ` [PATCH 2/4] ext3: truncate block allocated on a failed ext3_write_begin Theodore Ts'o
  2008-09-17 19:19 ` [PATCH 1/4] ext3: Fix ext3_dx_readdir hash collision handling Andrew Morton
  0 siblings, 2 replies; 13+ messages in thread
From: Theodore Ts'o @ 2008-09-13 15:32 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, Theodore Ts'o, Eugene Dashevsky, Mike Snitzer,
	linux-ext4

This fixes a bug where readdir() would return a directory entry twice
if there was a hash collision in an hash tree indexed directory.

Signed-off-by: Eugene Dashevsky <eugene@ibrix.com>
Signed-off-by: Mike Snitzer <msnitzer@ibrix.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: linux-ext4@vger.kernel.org
---
 fs/ext3/dir.c |   22 ++++++++++++++++------
 1 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/fs/ext3/dir.c b/fs/ext3/dir.c
index 2eea96e..42c5391 100644
--- a/fs/ext3/dir.c
+++ b/fs/ext3/dir.c
@@ -410,7 +410,7 @@ static int call_filldir(struct file * filp, void * dirent,
 				get_dtype(sb, fname->file_type));
 		if (error) {
 			filp->f_pos = curr_pos;
-			info->extra_fname = fname->next;
+			info->extra_fname = fname;
 			return error;
 		}
 		fname = fname->next;
@@ -449,11 +449,21 @@ static int ext3_dx_readdir(struct file * filp,
 	 * If there are any leftover names on the hash collision
 	 * chain, return them first.
 	 */
-	if (info->extra_fname &&
-	    call_filldir(filp, dirent, filldir, info->extra_fname))
-		goto finished;

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/4] ext3: truncate block allocated on a failed ext3_write_begin
  2008-09-13 15:32 [PATCH 1/4] ext3: Fix ext3_dx_readdir hash collision handling Theodore Ts'o
@ 2008-09-13 15:32 ` Theodore Ts'o
  2008-09-13 15:32   ` [PATCH 3/4] ext2: Avoid printk floods in the face of directory corruption Theodore Ts'o
  2008-09-17 19:22   ` [PATCH 2/4] ext3: truncate block allocated on a failed ext3_write_begin Andrew Morton
  2008-09-17 19:19 ` [PATCH 1/4] ext3: Fix ext3_dx_readdir hash collision handling Andrew Morton
  1 sibling, 2 replies; 13+ messages in thread
From: Theodore Ts'o @ 2008-09-13 15:32 UTC (permalink / raw)
  To: akpm; +Cc: linux-kernel, Aneesh Kumar K.V, Theodore Ts'o, linux-ext4

From: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

For blocksize < pagesize we need to remove blocks that got allocated in
block_write_begin() if we fail with ENOSPC for later blocks.
block_write_begin() internally does this if it allocated page
locally. This makes sure we don't have blocks outside inode.i_size
during ENOSPC.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: linux-ext4@vger.kernel.org
---
 fs/ext3/inode.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
index 507d868..bff22b9 100644
--- a/fs/ext3/inode.c
+++ b/fs/ext3/inode.c
@@ -1178,6 +1178,13 @@ write_begin_failed:
 		ext3_journal_stop(handle);
 		unlock_page(page);
 		page_cache_release(page);
+		/*
+		 * block_write_begin may have instantiated a few blocks
+		 * outside i_size.  Trim these off again. Don't need
+		 * i_size_read because we hold i_mutex.
+		 */
+		if (pos + len > inode->i_size)
+			vmtruncate(inode, inode->i_size);
 	}
 	if (ret == -ENOSPC && ext3_should_retry_alloc(inode->i_sb, &retries))
 		goto retry;
-- 
1.5.6.1.205.ge2c7.dirty


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/4] ext2: Avoid printk floods in the face of directory corruption
  2008-09-13 15:32 ` [PATCH 2/4] ext3: truncate block allocated on a failed ext3_write_begin Theodore Ts'o
@ 2008-09-13 15:32   ` Theodore Ts'o
  2008-09-13 15:32     ` [PATCH 4/4] ext3: " Theodore Ts'o
                       ` (2 more replies)
  2008-09-17 19:22   ` [PATCH 2/4] ext3: truncate block allocated on a failed ext3_write_begin Andrew Morton
  1 sibling, 3 replies; 13+ messages in thread
From: Theodore Ts'o @ 2008-09-13 15:32 UTC (permalink / raw)
  To: akpm; +Cc: linux-kernel, Theodore Ts'o, Eric Sandeen, linux-ext4,
	Eugene Teo

Note: some people thinks this represents a security bug, since it
might make the system go away while it is printing a large number of
console messages, especially if a serial console is involved.  Hence,
it has been assigned CVE-2008-3528, but it requires that the attacker
either has physical access to your machine to insert a USB disk with a
corrupted filesystem image (at which point why not just hit the power
button), or is otherwise able to convince the system administrator to
mount an arbitrary filesystem image (at which point why not just
include a setuid shell or world-writable hard disk device file or some
such).  Me, I think they're just being silly.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: linux-ext4@vger.kernel.org
Cc: Eugene Teo <eugeneteo@kernel.sg>
---
 fs/ext2/dir.c |   60 +++++++++++++++++++++++++++++++++-----------------------
 1 files changed, 35 insertions(+), 25 deletions(-)

diff --git a/fs/ext2/dir.c b/fs/ext2/dir.c
index a78c6b4..c53c790 100644
--- a/fs/ext2/dir.c
+++ b/fs/ext2/dir.c
@@ -103,7 +103,7 @@ static int ext2_commit_chunk(struct page *page, loff_t pos, unsigned len)
 	return err;
 }
 
-static void ext2_check_page(struct page *page)
+static void ext2_check_page(struct page *page, int quiet)
 {
 	struct inode *dir = page->mapping->host;
 	struct super_block *sb = dir->i_sb;
@@ -146,10 +146,10 @@ out:
 	/* Too bad, we had an error */
 
 Ebadsize:
-	ext2_error(sb, "ext2_check_page",
-		"size of directory #%lu is not a multiple of chunk size",
-		dir->i_ino
-	);
+	if (!quiet)
+		ext2_error(sb, __func__,
+			"size of directory #%lu is not a multiple "
+			"of chunk size", dir->i_ino);
 	goto fail;
 Eshort:
 	error = "rec_len is smaller than minimal";
@@ -166,32 +166,36 @@ Espan:
 Einumber:
 	error = "inode out of bounds";
 bad_entry:
-	ext2_error (sb, "ext2_check_page", "bad entry in directory #%lu: %s - "
-		"offset=%lu, inode=%lu, rec_len=%d, name_len=%d",
-		dir->i_ino, error, (page->index<<PAGE_CACHE_SHIFT)+offs,
-		(unsigned long) le32_to_cpu(p->inode),
-		rec_len, p->name_len);
+	if (!quiet)
+		ext2_error(sb, __func__, "bad entry in directory #%lu: : %s - "
+			"offset=%lu, inode=%lu, rec_len=%d, name_len=%d",
+			dir->i_ino, error, (page->index<<PAGE_CACHE_SHIFT)+offs,
+			(unsigned long) le32_to_cpu(p->inode),
+			rec_len, p->name_len);
 	goto fail;
 Eend:
-	p = (ext2_dirent *)(kaddr + offs);
-	ext2_error (sb, "ext2_check_page",
-		"entry in directory #%lu spans the page boundary"
-		"offset=%lu, inode=%lu",
-		dir->i_ino, (page->index<<PAGE_CACHE_SHIFT)+offs,
-		(unsigned long) le32_to_cpu(p->inode));
+	if (!quiet) {
+		p = (ext2_dirent *)(kaddr + offs);
+		ext2_error (sb, "ext2_check_page",
+			"entry in directory #%lu spans the page boundary"
+			"offset=%lu, inode=%lu",
+			dir->i_ino, (page->index<<PAGE_CACHE_SHIFT)+offs,
+			(unsigned long) le32_to_cpu(p->inode));
+	}
 fail:
 	SetPageChecked(page);
 	SetPageError(page);
 }
 
-static struct page * ext2_get_page(struct inode *dir, unsigned long n)
+static struct page * ext2_get_page(struct inode *dir, unsigned long n,
+				   int quiet)
 {
 	struct address_space *mapping = dir->i_mapping;
 	struct page *page = read_mapping_page(mapping, n, NULL);
 	if (!IS_ERR(page)) {
 		kmap(page);
 		if (!PageChecked(page))
-			ext2_check_page(page);
+			ext2_check_page(page, quiet);
 		if (PageError(page))
 			goto fail;
 	}
@@ -292,7 +296,7 @@ ext2_readdir (struct file * filp, void * dirent, filldir_t filldir)
 	for ( ; n < npages; n++, offset = 0) {
 		char *kaddr, *limit;
 		ext2_dirent *de;
-		struct page *page = ext2_get_page(inode, n);
+		struct page *page = ext2_get_page(inode, n, 0);
 
 		if (IS_ERR(page)) {
 			ext2_error(sb, __func__,
@@ -361,6 +365,7 @@ struct ext2_dir_entry_2 * ext2_find_entry (struct inode * dir,
 	struct page *page = NULL;
 	struct ext2_inode_info *ei = EXT2_I(dir);
 	ext2_dirent * de;
+	int dir_has_error = 0;
 
 	if (npages == 0)
 		goto out;
@@ -374,7 +379,7 @@ struct ext2_dir_entry_2 * ext2_find_entry (struct inode * dir,
 	n = start;
 	do {
 		char *kaddr;
-		page = ext2_get_page(dir, n);
+		page = ext2_get_page(dir, n, dir_has_error);
 		if (!IS_ERR(page)) {
 			kaddr = page_address(page);
 			de = (ext2_dirent *) kaddr;
@@ -391,7 +396,9 @@ struct ext2_dir_entry_2 * ext2_find_entry (struct inode * dir,
 				de = ext2_next_entry(de);
 			}
 			ext2_put_page(page);
-		}
+		} else
+			dir_has_error = 1;
+
 		if (++n >= npages)
 			n = 0;
 		/* next page is past the blocks we've got */
@@ -414,7 +421,7 @@ found:
 
 struct ext2_dir_entry_2 * ext2_dotdot (struct inode *dir, struct page **p)
 {
-	struct page *page = ext2_get_page(dir, 0);
+	struct page *page = ext2_get_page(dir, 0, 0);
 	ext2_dirent *de = NULL;
 
 	if (!IS_ERR(page)) {
@@ -487,7 +494,7 @@ int ext2_add_link (struct dentry *dentry, struct inode *inode)
 	for (n = 0; n <= npages; n++) {
 		char *dir_end;
 
-		page = ext2_get_page(dir, n);
+		page = ext2_get_page(dir, n, 0);
 		err = PTR_ERR(page);
 		if (IS_ERR(page))
 			goto out;
@@ -655,14 +662,17 @@ int ext2_empty_dir (struct inode * inode)
 {
 	struct page *page = NULL;
 	unsigned long i, npages = dir_pages(inode);
+	int dir_has_error = 0;
 
 	for (i = 0; i < npages; i++) {
 		char *kaddr;
 		ext2_dirent * de;
-		page = ext2_get_page(inode, i);
+		page = ext2_get_page(inode, i, dir_has_error);
 
-		if (IS_ERR(page))
+		if (IS_ERR(page)) {
+			dir_has_error = 1;
 			continue;
+		}
 
 		kaddr = page_address(page);
 		de = (ext2_dirent *)kaddr;
-- 
1.5.6.1.205.ge2c7.dirty


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 4/4] ext3: Avoid printk floods in the face of directory corruption
  2008-09-13 15:32   ` [PATCH 3/4] ext2: Avoid printk floods in the face of directory corruption Theodore Ts'o
@ 2008-09-13 15:32     ` Theodore Ts'o
  2008-09-18  0:57       ` Eugene Teo
  2008-09-17 19:25     ` [PATCH 3/4] ext2: " Andrew Morton
  2008-09-18  9:46     ` Eugene Teo
  2 siblings, 1 reply; 13+ messages in thread
From: Theodore Ts'o @ 2008-09-13 15:32 UTC (permalink / raw)
  To: akpm; +Cc: linux-kernel, Theodore Ts'o, Eric Sandeen, linux-ext4,
	Eugene Teo

Note: some people thinks this represents a security bug, since it
might make the system go away while it is printing a large number of
console messages, especially if a serial console is involved.  Hence,
it has been assigned CVE-2008-3528, but it requires that the attacker
either has physical access to your machine to insert a USB disk with a
corrupted filesystem image (at which point why not just hit the power
button), or is otherwise able to convince the system administrator to
mount an arbitrary filesystem image (at which point why not just
include a setuid shell or world-writable hard disk device file or some
such).  Me, I think they're just being silly.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: linux-ext4@vger.kernel.org
Cc: Eugene Teo <eugeneteo@kernel.sg>
---
 fs/ext3/dir.c |   10 +++++++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/fs/ext3/dir.c b/fs/ext3/dir.c
index 42c5391..283938a 100644
--- a/fs/ext3/dir.c
+++ b/fs/ext3/dir.c
@@ -102,6 +102,7 @@ static int ext3_readdir(struct file * filp,
 	int err;
 	struct inode *inode = filp->f_path.dentry->d_inode;
 	int ret = 0;
+	int dir_has_error = 0;
 
 	sb = inode->i_sb;
 
@@ -148,9 +149,12 @@ static int ext3_readdir(struct file * filp,
 		 * of recovering data when there's a bad sector
 		 */
 		if (!bh) {
-			ext3_error (sb, "ext3_readdir",
-				"directory #%lu contains a hole at offset %lu",
-				inode->i_ino, (unsigned long)filp->f_pos);
+			if (!dir_has_error) {
+				ext3_error(sb, __func__, "directory #%lu "
+					"contains a hole at offset %lld",
+					inode->i_ino, filp->f_pos);
+				dir_has_error = 1;
+			}
 			/* corrupt size?  Maybe no more blocks to read */
 			if (filp->f_pos > inode->i_blocks << 9)
 				break;
-- 
1.5.6.1.205.ge2c7.dirty


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/4] ext3: Avoid printk floods in the face of directory corruption
  2008-09-13 15:32     ` [PATCH 4/4] ext3: " Theodore Ts'o
@ 2008-09-18  0:57       ` Eugene Teo
  0 siblings, 0 replies; 13+ messages in thread
From: Eugene Teo @ 2008-09-18  0:57 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: akpm, linux-kernel, Eric Sandeen, linux-ext4

On Sat, Sep 13, 2008 at 11:32 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> Note: some people thinks this represents a security bug, since it
> might make the system go away while it is printing a large number of
> console messages, especially if a serial console is involved.  Hence,
> it has been assigned CVE-2008-3528, but it requires that the attacker
> either has physical access to your machine to insert a USB disk with a
> corrupted filesystem image (at which point why not just hit the power
> button), or is otherwise able to convince the system administrator to
> mount an arbitrary filesystem image (at which point why not just
> include a setuid shell or world-writable hard disk device file or some
> such).  Me, I think they're just being silly.

The description should explain what the problem is. And the last
sentence is a little ambiguous.

This is a user-triggerable DoS. The administrator who mounted the
filesystem image or partition might not know that the dir->i_size and
dir->i_blocks are corrupted. A remote user just need to perform either
a read or write operation to the mounted image or partition, and this
could trigger the problem, resulting in a denial of service.

Take note that another problem the test image shows is that, the
ext2/3 (and possibly ext4) filesystem does not honour the read-only
mode when the revision level is too high. That is, when
le32_to_cpu(es->s_rev_level) > EXT3_MAX_SUPP_REV. Eric replied me in a
private email that this is a different, and unrelated bug that will be
fixed.

Thanks, Eugene

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/4] ext2: Avoid printk floods in the face of directory corruption
  2008-09-13 15:32   ` [PATCH 3/4] ext2: Avoid printk floods in the face of directory corruption Theodore Ts'o
  2008-09-13 15:32     ` [PATCH 4/4] ext3: " Theodore Ts'o
@ 2008-09-17 19:25     ` Andrew Morton
  2008-09-17 19:30       ` Eric Sandeen
  2008-09-18  9:46     ` Eugene Teo
  2 siblings, 1 reply; 13+ messages in thread
From: Andrew Morton @ 2008-09-17 19:25 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: linux-kernel, tytso, sandeen, linux-ext4, eugeneteo

On Sat, 13 Sep 2008 11:32:50 -0400
"Theodore Ts'o" <tytso@MIT.EDU> wrote:

> From: "Theodore Ts'o" <tytso@MIT.EDU>
> To: akpm@linux-foundation.org
> Cc: linux-kernel@vger.kernel.org, "Theodore Ts'o" <tytso@MIT.EDU>, Eric Sandeen <sandeen@redhat.com>, linux-ext4@vger.kernel.org, Eugene Teo <eugeneteo@kernel.sg>
> Subject: [PATCH 3/4] ext2: Avoid printk floods in the face of directory corruption
> Date: Sat, 13 Sep 2008 11:32:50 -0400
> X-Mailer: git-send-email 1.5.6.1.205.ge2c7.dirty
> 
> Note: some people thinks this represents a security bug, since it
> might make the system go away while it is printing a large number of
> console messages, especially if a serial console is involved.  Hence,
> it has been assigned CVE-2008-3528, but it requires that the attacker
> either has physical access to your machine to insert a USB disk with a
> corrupted filesystem image (at which point why not just hit the power
> button), or is otherwise able to convince the system administrator to
> mount an arbitrary filesystem image (at which point why not just
> include a setuid shell or world-writable hard disk device file or some
> such).  Me, I think they're just being silly.
> 
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
> Cc: linux-ext4@vger.kernel.org
> Cc: Eugene Teo <eugeneteo@kernel.sg>

This patch was purportedly authored by yourself, but I'm going to
assume that it was authored by Eric.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/4] ext2: Avoid printk floods in the face of directory corruption
  2008-09-17 19:25     ` [PATCH 3/4] ext2: " Andrew Morton
@ 2008-09-17 19:30       ` Eric Sandeen
  0 siblings, 0 replies; 13+ messages in thread
From: Eric Sandeen @ 2008-09-17 19:30 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Theodore Ts'o, linux-kernel, linux-ext4, eugeneteo

Andrew Morton wrote:
> On Sat, 13 Sep 2008 11:32:50 -0400
> "Theodore Ts'o" <tytso@MIT.EDU> wrote:
> 
>> From: "Theodore Ts'o" <tytso@MIT.EDU>
>> To: akpm@linux-foundation.org
>> Cc: linux-kernel@vger.kernel.org, "Theodore Ts'o" <tytso@MIT.EDU>, Eric Sandeen <sandeen@redhat.com>, linux-ext4@vger.kernel.org, Eugene Teo <eugeneteo@kernel.sg>
>> Subject: [PATCH 3/4] ext2: Avoid printk floods in the face of directory corruption
>> Date: Sat, 13 Sep 2008 11:32:50 -0400
>> X-Mailer: git-send-email 1.5.6.1.205.ge2c7.dirty
>>
>> Note: some people thinks this represents a security bug, since it
>> might make the system go away while it is printing a large number of
>> console messages, especially if a serial console is involved.  Hence,
>> it has been assigned CVE-2008-3528, but it requires that the attacker
>> either has physical access to your machine to insert a USB disk with a
>> corrupted filesystem image (at which point why not just hit the power
>> button), or is otherwise able to convince the system administrator to
>> mount an arbitrary filesystem image (at which point why not just
>> include a setuid shell or world-writable hard disk device file or some
>> such).  Me, I think they're just being silly.
>>
>> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
>> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
>> Cc: linux-ext4@vger.kernel.org
>> Cc: Eugene Teo <eugeneteo@kernel.sg>
> 
> This patch was purportedly authored by yourself, but I'm going to
> assume that it was authored by Eric.

It was, after some discussion w/ Ted & Andreas.

Also just FWIW I'm also in the "as a security issue this is a bit silly"
camp ;)

Thanks,
-Eric

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/4] ext2: Avoid printk floods in the face of directory corruption
  2008-09-13 15:32   ` [PATCH 3/4] ext2: Avoid printk floods in the face of directory corruption Theodore Ts'o
  2008-09-13 15:32     ` [PATCH 4/4] ext3: " Theodore Ts'o
  2008-09-17 19:25     ` [PATCH 3/4] ext2: " Andrew Morton
@ 2008-09-18  9:46     ` Eugene Teo
  2 siblings, 0 replies; 13+ messages in thread
From: Eugene Teo @ 2008-09-18  9:46 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: akpm, linux-kernel, Eric Sandeen, linux-ext4

On Sat, Sep 13, 2008 at 11:32 PM, Theodore Ts'o <tytso@mit.edu> wrote:
[...]
>  Eend:
> -       p = (ext2_dirent *)(kaddr + offs);
> -       ext2_error (sb, "ext2_check_page",
> -               "entry in directory #%lu spans the page boundary"
> -               "offset=%lu, inode=%lu",
> -               dir->i_ino, (page->index<<PAGE_CACHE_SHIFT)+offs,
> -               (unsigned long) le32_to_cpu(p->inode));
> +       if (!quiet) {
> +               p = (ext2_dirent *)(kaddr + offs);
> +               ext2_error (sb, "ext2_check_page",
                                                ^^^^^^^^^^^^^^^^^^
> +                       "entry in directory #%lu spans the page boundary"
> +                       "offset=%lu, inode=%lu",
> +                       dir->i_ino, (page->index<<PAGE_CACHE_SHIFT)+offs,
> +                       (unsigned long) le32_to_cpu(p->inode));

Minor issue. Since you are already changing "ext2_check_page" to
__func__, you might as well change it here too.

Thanks, Eugene

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/4] ext3: truncate block allocated on a failed ext3_write_begin
  2008-09-13 15:32 ` [PATCH 2/4] ext3: truncate block allocated on a failed ext3_write_begin Theodore Ts'o
  2008-09-13 15:32   ` [PATCH 3/4] ext2: Avoid printk floods in the face of directory corruption Theodore Ts'o
@ 2008-09-17 19:22   ` Andrew Morton
  2008-09-18  7:03     ` Aneesh Kumar K.V
  1 sibling, 1 reply; 13+ messages in thread
From: Andrew Morton @ 2008-09-17 19:22 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: linux-kernel, aneesh.kumar, tytso, linux-ext4, Nick Piggin

On Sat, 13 Sep 2008 11:32:49 -0400
"Theodore Ts'o" <tytso@MIT.EDU> wrote:

> From: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> 
> For blocksize < pagesize we need to remove blocks that got allocated in
> block_write_begin() if we fail with ENOSPC for later blocks.
> block_write_begin() internally does this if it allocated page
> locally. This makes sure we don't have blocks outside inode.i_size
> during ENOSPC.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
> Cc: linux-ext4@vger.kernel.org
> ---
>  fs/ext3/inode.c |    7 +++++++
>  1 files changed, 7 insertions(+), 0 deletions(-)
> 
> diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
> index 507d868..bff22b9 100644
> --- a/fs/ext3/inode.c
> +++ b/fs/ext3/inode.c
> @@ -1178,6 +1178,13 @@ write_begin_failed:
>  		ext3_journal_stop(handle);
>  		unlock_page(page);
>  		page_cache_release(page);
> +		/*
> +		 * block_write_begin may have instantiated a few blocks
> +		 * outside i_size.  Trim these off again. Don't need
> +		 * i_size_read because we hold i_mutex.
> +		 */
> +		if (pos + len > inode->i_size)
> +			vmtruncate(inode, inode->i_size);
>  	}
>  	if (ret == -ENOSPC && ext3_should_retry_alloc(inode->i_sb, &retries))
>  		goto retry;

Well we used to do this trimming in core VFS, but Nick broke it.  We
still do it if the fs doesn't implement ->write_begin().

Should we do this trimming in pagecache_write_begin() in both cases?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/4] ext3: truncate block allocated on a failed ext3_write_begin
  2008-09-17 19:22   ` [PATCH 2/4] ext3: truncate block allocated on a failed ext3_write_begin Andrew Morton
@ 2008-09-18  7:03     ` Aneesh Kumar K.V
  0 siblings, 0 replies; 13+ messages in thread
From: Aneesh Kumar K.V @ 2008-09-18  7:03 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Theodore Ts'o, linux-kernel, linux-ext4, Nick Piggin

On Wed, Sep 17, 2008 at 12:22:54PM -0700, Andrew Morton wrote:
> On Sat, 13 Sep 2008 11:32:49 -0400
> "Theodore Ts'o" <tytso@MIT.EDU> wrote:
> 
> > From: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> > 
> > For blocksize < pagesize we need to remove blocks that got allocated in
> > block_write_begin() if we fail with ENOSPC for later blocks.
> > block_write_begin() internally does this if it allocated page
> > locally. This makes sure we don't have blocks outside inode.i_size
> > during ENOSPC.
> > 
> > Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> > Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
> > Cc: linux-ext4@vger.kernel.org
> > ---
> >  fs/ext3/inode.c |    7 +++++++
> >  1 files changed, 7 insertions(+), 0 deletions(-)
> > 
> > diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
> > index 507d868..bff22b9 100644
> > --- a/fs/ext3/inode.c
> > +++ b/fs/ext3/inode.c
> > @@ -1178,6 +1178,13 @@ write_begin_failed:
> >  		ext3_journal_stop(handle);
> >  		unlock_page(page);
> >  		page_cache_release(page);
> > +		/*
> > +		 * block_write_begin may have instantiated a few blocks
> > +		 * outside i_size.  Trim these off again. Don't need
> > +		 * i_size_read because we hold i_mutex.
> > +		 */
> > +		if (pos + len > inode->i_size)
> > +			vmtruncate(inode, inode->i_size);
> >  	}
> >  	if (ret == -ENOSPC && ext3_should_retry_alloc(inode->i_sb, &retries))
> >  		goto retry;
> 
> Well we used to do this trimming in core VFS, but Nick broke it.  We
> still do it if the fs doesn't implement ->write_begin().

We still do it in block_write_begin if the pages are allocated by
block_write_begin.
> 
> Should we do this trimming in pagecache_write_begin() in both cases?


pagecache_write_begin is not used in the write_begin call path for
ext3/ext4.
generic_file_buffered_write
	generic_perform_write
		ext3_write_begin
			block_write_begin

-aneesh

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/4] ext3: Fix ext3_dx_readdir hash collision handling
  2008-09-13 15:32 [PATCH 1/4] ext3: Fix ext3_dx_readdir hash collision handling Theodore Ts'o
  2008-09-13 15:32 ` [PATCH 2/4] ext3: truncate block allocated on a failed ext3_write_begin Theodore Ts'o
@ 2008-09-17 19:19 ` Andrew Morton
  2008-10-01 22:37   ` Theodore Tso
  1 sibling, 1 reply; 13+ messages in thread
From: Andrew Morton @ 2008-09-17 19:19 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: linux-kernel, tytso, eugene, msnitzer, linux-ext4

On Sat, 13 Sep 2008 11:32:48 -0400
"Theodore Ts'o" <tytso@MIT.EDU> wrote:

> This fixes a bug where readdir() would return a directory entry twice
> if there was a hash collision in an hash tree indexed directory.

That sounds like a serious problem, but given the amount of time it
took to turn up, I guess it's pretty rare.

What are your thoughts regarding a 2.6.27 merge for this?  2.6.26.x? 
2.6.25.x?  ...


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/4] ext3: Fix ext3_dx_readdir hash collision handling
  2008-09-17 19:19 ` [PATCH 1/4] ext3: Fix ext3_dx_readdir hash collision handling Andrew Morton
@ 2008-10-01 22:37   ` Theodore Tso
  2008-10-01 23:33     ` Andrew Morton
  0 siblings, 1 reply; 13+ messages in thread
From: Theodore Tso @ 2008-10-01 22:37 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, eugene, msnitzer, linux-ext4

On Wed, Sep 17, 2008 at 12:19:43PM -0700, Andrew Morton wrote:
> On Sat, 13 Sep 2008 11:32:48 -0400
> "Theodore Ts'o" <tytso@MIT.EDU> wrote:
> 
> > This fixes a bug where readdir() would return a directory entry twice
> > if there was a hash collision in an hash tree indexed directory.
> 
> That sounds like a serious problem, but given the amount of time it
> took to turn up, I guess it's pretty rare.
> 
> What are your thoughts regarding a 2.6.27 merge for this?  2.6.26.x? 
> 2.6.25.x?  ...
> 

It's not a regression, so per Linus's request that at this point we're
too late for anything other than regression fixes, I had assumed that
it should be pushed for 2.6.28, and then go into the various stable
trees.  It's true that it took quite a while for people to notice,
probably because many programs won't notice if readdir() returns a
directory entry twice.

BTW, this hasn't hit -mm yet, and I've got a number of ext3 patches
that don't appear to have hit -mm, including one that was authored by
Linus.  Should I create a git tree or a quilt series if that would
make things easier for you?

	      	   	   	      	- Ted

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/4] ext3: Fix ext3_dx_readdir hash collision handling
  2008-10-01 22:37   ` Theodore Tso
@ 2008-10-01 23:33     ` Andrew Morton
  0 siblings, 0 replies; 13+ messages in thread
From: Andrew Morton @ 2008-10-01 23:33 UTC (permalink / raw)
  To: Theodore Tso; +Cc: linux-kernel, eugene, msnitzer, linux-ext4

On Wed, 1 Oct 2008 18:37:55 -0400
Theodore Tso <tytso@mit.edu> wrote:

> On Wed, Sep 17, 2008 at 12:19:43PM -0700, Andrew Morton wrote:
> > On Sat, 13 Sep 2008 11:32:48 -0400
> > "Theodore Ts'o" <tytso@MIT.EDU> wrote:
> > 
> > > This fixes a bug where readdir() would return a directory entry twice
> > > if there was a hash collision in an hash tree indexed directory.
> > 
> > That sounds like a serious problem, but given the amount of time it
> > took to turn up, I guess it's pretty rare.
> > 
> > What are your thoughts regarding a 2.6.27 merge for this?  2.6.26.x? 
> > 2.6.25.x?  ...
> > 
> 
> It's not a regression, so per Linus's request that at this point we're
> too late for anything other than regression fixes, I had assumed that
> it should be pushed for 2.6.28, and then go into the various stable
> trees.  It's true that it took quite a while for people to notice,
> probably because many programs won't notice if readdir() returns a
> directory entry twice.

OK.

> BTW, this hasn't hit -mm yet, and I've got a number of ext3 patches
> that don't appear to have hit -mm, including one that was authored by
> Linus.  Should I create a git tree or a quilt series if that would
> make things easier for you?
> 

I currently have:

jbd-abort-when-failed-to-log-metadata-buffers.patch
#jbd-fix-error-handling-for-checkpoint-io.patch: double-check this
jbd-fix-error-handling-for-checkpoint-io.patch
ext3-add-checks-for-errors-from-jbd.patch
jbd-dont-dirty-original-metadata-buffer-on-abort.patch
ext3-dont-try-to-resize-if-there-are-no-reserved-gdt-blocks-left.patch
ext3-fix-ext3-block-reservation-early-enospc-issue.patch
jbd-test-bh_write_eio-to-detect-errors-on-metadata-buffers.patch
ext3-add-an-option-to-control-error-handling-on-file-data.patch
jbd-ordered-data-integrity-fix.patch
ext3-fix-ext3_dx_readdir-hash-collision-handling.patch
ext3-fix-ext3_dx_readdir-hash-collision-handling-checkpatch-fixes.patch
ext3-truncate-block-allocated-on-a-failed-ext3_write_begin.patch
ext3-avoid-printk-floods-in-the-face-of-directory-corruption.patch
#jbd-abort-instead-of-waiting-for-nonexistent-transactions.patch: sct probs
jbd-abort-instead-of-waiting-for-nonexistent-transactions.patch

and I don't see anything from yourself in the backlog?

There are some ext3 patches from other people I need to go through -
I've been offlineish for nearly a week and have spent a fun day staring
in horror at the mess which people are shovelling in Stephen's
direction for linux-next.


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2008-10-01 23:33 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-09-13 15:32 [PATCH 1/4] ext3: Fix ext3_dx_readdir hash collision handling Theodore Ts'o
2008-09-13 15:32 ` [PATCH 2/4] ext3: truncate block allocated on a failed ext3_write_begin Theodore Ts'o
2008-09-13 15:32   ` [PATCH 3/4] ext2: Avoid printk floods in the face of directory corruption Theodore Ts'o
2008-09-13 15:32     ` [PATCH 4/4] ext3: " Theodore Ts'o
2008-09-18  0:57       ` Eugene Teo
2008-09-17 19:25     ` [PATCH 3/4] ext2: " Andrew Morton
2008-09-17 19:30       ` Eric Sandeen
2008-09-18  9:46     ` Eugene Teo
2008-09-17 19:22   ` [PATCH 2/4] ext3: truncate block allocated on a failed ext3_write_begin Andrew Morton
2008-09-18  7:03     ` Aneesh Kumar K.V
2008-09-17 19:19 ` [PATCH 1/4] ext3: Fix ext3_dx_readdir hash collision handling Andrew Morton
2008-10-01 22:37   ` Theodore Tso
2008-10-01 23:33     ` Andrew Morton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).