Intentionally corrupted ext4s causing two different kernel panics at umount

linux-ext4.vger.kernel.org archive mirror

* Intentionally corrupted ext4s causing two different kernel panics at umount
@ 2014-10-05  0:12 Sami Liedes
  2014-10-06  2:48 ` [PATCH 1/2] ext4: don't orphan or truncate the boot loader inode Theodore Ts'o
  2014-10-07 20:56 ` One more corrupted fs crash in ext4_put_super Sami Liedes
  0 siblings, 2 replies; 15+ messages in thread
From: Sami Liedes @ 2014-10-05  0:12 UTC (permalink / raw)
  To: linux-ext4


Hi!

I ran some fuzz tests on an ext4 filesystem on 3.16.3 and on 3.17-rc7
and found some filesystems that differ from a pristine filesystem by
one bit and cause a kernel panic at unmount time.

The set of operations I run for each filesystem is this:

   mount $TARGET_DEV /mnt -t $FSTYPE -o errors=continue
   cd /mnt
   timeout 30 cp -r doc doc2 >&/dev/null
   timeout 30 find -xdev >&/dev/null
   timeout 30 find -xdev -print0 2>/dev/null |xargs -0 touch -- >&/dev/null
   timeout 30 mkdir tmp >&/dev/null
   timeout 30 echo whoah >tmp/filu >&/dev/null
   timeout 30 rm -rf /mnt/* >&/dev/null
   cd /
   umount /mnt

I got two distinct backtraces, and for both of them I have two test
images that differ from a clean ext4 filesystem by a single bit.
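The "differ by a single bit" property can be checked mechanically. The
following is a minimal sketch, not the tooling used for these tests: the
function name count_differing_bits is made up for illustration, and it
assumes both images have already been read into equally sized buffers.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the original report): count the bits that
 * differ between two equally sized buffers, e.g. a pristine and a fuzzed
 * filesystem image loaded into memory.
 */
size_t count_differing_bits(const uint8_t *a, const uint8_t *b, size_t len)
{
	size_t diff = 0;

	for (size_t i = 0; i < len; i++) {
		uint8_t x = a[i] ^ b[i];	/* set bits mark differences */

		while (x) {
			x &= (uint8_t)(x - 1);	/* clear the lowest set bit */
			diff++;
		}
	}
	return diff;
}
```

For the first image listed below, the changed bytes are 05 02 00 00 in the
pristine image and 05 00 00 00 in the test image, which differ in exactly
one bit.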

You can get the pristine filesystem from

   http://www.niksula.hut.fi/~sliedes/ext4/testimg.ext4.pristine.bz2

For the rest of the files, see

   http://www.niksula.hut.fi/~sliedes/ext4/


1. Crash in ext4_put_super
==========================

Test filesystems and diffs to the pristine image:

   http://www.niksula.hut.fi/~sliedes/ext4/ext4_put_super/testimg.ext4.20942.min.bz2

--- /dev/fd/63  2014-10-05 02:22:36.822155073 +0300
+++ /dev/fd/62  2014-10-05 02:22:36.822155073 +0300
@@ -32572,7 +32572,7 @@
 001795a0  2d 70 63 73 70 6b 72 2d  65 76 65 6e 74 2d 73 70  |-pcspkr-event-sp|
 001795b0  6b 72 0c 00 e1 01 00 00  20 00 18 02 62 75 73 5c  |kr...... ...bus\|
 001795c0  78 32 66 75 73 62 5c 78  32 66 30 30 38 5c 78 32  |x2fusb\x2f008\x2|
-001795d0  66 30 30 31 05 02 00 00  18 00 0e 02 75 73 62 64  |f001........usbd|
+001795d0  66 30 30 31 05 00 00 00  18 00 0e 02 75 73 62 64  |f001........usbd|
 001795e0  65 76 37 2e 31 5f 65 70  38 31 10 00 1f 02 00 00  |ev7.1_ep81......|
 001795f0  18 00 0e 02 75 73 62 64  65 76 31 2e 31 5f 65 70  |....usbdev1.1_ep|
 00179600  30 30 04 02 25 02 00 00  18 00 0e 02 75 73 62 64  |00..%.......usbd|


   http://www.niksula.hut.fi/~sliedes/ext4/ext4_put_super/testimg.ext4.106360.min.bz2

--- /dev/fd/63  2014-10-05 02:22:36.501155217 +0300
+++ /dev/fd/62  2014-10-05 02:22:36.501155217 +0300
@@ -36271,7 +36271,7 @@
 *
 001b8400  03 04 00 00 0c 00 01 02  2e 00 00 00 0c 00 00 00  |................|
 001b8410  0c 00 02 02 2e 2e 00 00  04 04 00 00 0c 00 04 04  |................|
-001b8420  73 64 65 33 05 04 00 00  14 00 0c 04 72 6f 6f 74  |sde3........root|
+001b8420  73 64 65 33 05 00 00 00  14 00 0c 04 72 6f 6f 74  |sde3........root|
 001b8430  2d 63 72 79 70 74 65 64  06 04 00 00 24 00 1b 04  |-crypted....$...|
 001b8440  6c 76 6d 32 7c 6d 79 5f  63 6f 6e 74 61 69 6e 65  |lvm2|my_containe|
 001b8450  72 7c 6d 79 5f 72 65 67  69 6f 6e 00 07 04 00 00  |r|my_region.....|
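
Reading the two hexdumps above, the changed byte appears to sit in the
inode field of a directory entry. Below is a minimal sketch of decoding
such an entry, assuming the on-disk ext4_dir_entry_2 layout (little-endian
32-bit inode, 16-bit rec_len, 8-bit name_len, 8-bit file_type, then the
name). The struct and function names are illustrative, not kernel API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative in-memory view of one directory entry (not a kernel type). */
struct dirent_view {
	uint32_t inode;
	uint16_t rec_len;
	uint8_t  name_len;
	uint8_t  file_type;
	char     name[256];	/* NUL-terminated copy of the entry name */
};

/*
 * Decode one entry from raw bytes, assuming the ext4_dir_entry_2 layout.
 * Returns 0 on success, -1 if the buffer is too short for the entry.
 */
int decode_dirent(const uint8_t *p, size_t len, struct dirent_view *out)
{
	if (len < 8)
		return -1;
	out->inode = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
		     ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
	out->rec_len = (uint16_t)(p[4] | (p[5] << 8));
	out->name_len = p[6];
	out->file_type = p[7];
	if (len < 8u + out->name_len)
		return -1;
	memcpy(out->name, p + 8, out->name_len);
	out->name[out->name_len] = '\0';
	return 0;
}
```

Applied to the bytes starting at offset 0x1795d4 of the first image, this
gives inode 517 for "usbdev7.1_ep81" in the pristine image and inode 5
after the bit flip. The second image likewise changes the entry for
"root-crypted" from inode 1029 to inode 5, which matches the "Inode 5"
messages in the log below.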


The backtrace, trimmed from

   http://www.niksula.hut.fi/~sliedes/ext4/ext4_put_super/testimg.ext4.20942.min.log

[    1.034753] EXT4-fs (vdb): mounted filesystem with ordered data mode. Opts: errors=continue
[    1.353376] EXT4-fs warning (device vdb): ext4_unlink:2820: Deleting nonexistent file (5), 0
[    1.354480] EXT4-fs (vdb): Inode 5 (ffff8800048a0e10): orphan list check failed!
[    1.355433] ffff8800048a0e10: 00000000 00000000 00000000 00000000  ................
[...]
[    1.437175] ffff8800048a1500: 00000081 0000007f 00000000 00000000  ................
[    1.437769] CPU: 0 PID: 207 Comm: rm Not tainted 3.16.3 #3
[    1.438195] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[    1.438979]  ffff8800048a0e10 ffff880000647dd0 ffffffff81850b5c ffff8800048a0f80
[    1.439592]  ffff880000647e00 ffffffff812615bd 0000000000000700 ffff880000000001
[    1.440217]  ffff8800048a0f80 ffff8800048a1000 ffff880000647e18 ffffffff8116d723
[    1.440837] Call Trace:
[    1.441035]  [<ffffffff81850b5c>] dump_stack+0x45/0x56
[    1.441437]  [<ffffffff812615bd>] ext4_destroy_inode+0x9d/0xa0
[    1.441894]  [<ffffffff8116d723>] destroy_inode+0x33/0x70
[    1.442313]  [<ffffffff8116dd72>] evict+0x112/0x1a0
[    1.442696]  [<ffffffff8116eacd>] iput+0xed/0x190
[    1.443063]  [<ffffffff81162cd7>] do_unlinkat+0x197/0x2c0
[    1.443484]  [<ffffffff81063485>] ? sys32_fstatat+0x15/0x30
[    1.443920]  [<ffffffff81162e16>] SyS_unlinkat+0x16/0x40
[    1.444343]  [<ffffffff81859aa8>] sysenter_dispatch+0x7/0x25
[    1.447553] tsc: Refined TSC clocksource calibration: 3400.019 MHz
[    1.455218] EXT4-fs warning (device vdb): ext4_rmdir:2760: empty directory has too many links (3)
[    1.570473] EXT4-fs (vdb): sb orphan head is 5
[    1.571220] sb_info orphan list:
[    1.571645]   inode vdb:5 at ffff8800048a0f80: mode 100000, nlink 0, next 0
[    1.572569] ------------[ cut here ]------------
[    1.573168] kernel BUG at fs/ext4/super.c:836!
[    1.573745] invalid opcode: 0000 [#1] SMP
[    1.574308] CPU: 0 PID: 209 Comm: umount Not tainted 3.16.3 #3
[    1.575060] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[    1.576354] task: ffff880005e5c100 ti: ffff880005e34000 task.ti: ffff880005e34000
[    1.576549] RIP: 0010:[<ffffffff81261516>]  [<ffffffff81261516>] ext4_put_super+0x366/0x370
[    1.576549] RSP: 0018:ffff880005e37e70  EFLAGS: 00010202
[    1.576549] RAX: 000000000000003f RBX: ffff880005e31800 RCX: 0000000000000006
[    1.576549] RDX: 0000000000000007 RSI: 0000000000000001 RDI: 0000000000000246
[    1.576549] RBP: ffff880005e37ea0 R08: 0000000000000001 R09: 0000000000000000
[    1.576549] R10: 0000000000000000 R11: 0000000000000219 R12: ffff880005e31b28
[    1.576549] R13: ffff880005e31000 R14: ffff880005e31a88 R15: ffff880005e31b28
[    1.576549] FS:  0000000000000000(0000) GS:ffff880007c00000(0063) knlGS:00000000f746a780
[    1.576549] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[    1.576549] CR2: 0000000008d05014 CR3: 0000000005c2b000 CR4: 00000000000006b0
[    1.576549] Stack:
[    1.576549]  ffff880000000000 ffff880005e31000 ffff880005e310f8 ffffffff81a32840
[    1.576549]  0000000000000000 0000000000000000 ffff880005e37ec8 ffffffff811547dd
[    1.576549]  0000000000000083 ffff880006c0e100 0000000000000000 ffff880005e37ee8
[    1.576549] Call Trace:
[    1.576549]  [<ffffffff811547dd>] generic_shutdown_super+0x6d/0xf0
[    1.576549]  [<ffffffff81155a12>] kill_block_super+0x22/0x70
[    1.576549]  [<ffffffff811544fc>] deactivate_locked_super+0x3c/0x60
[    1.576549]  [<ffffffff8115457c>] deactivate_super+0x5c/0x60
[    1.576549]  [<ffffffff811728c1>] mntput_no_expire+0x171/0x260
[    1.576549]  [<ffffffff811744aa>] ? SyS_oldumount+0x7a/0xe0
[    1.576549]  [<ffffffff811744aa>] SyS_oldumount+0x7a/0xe0
[    1.576549]  [<ffffffff81859aa8>] sysenter_dispatch+0x7/0x25
[    1.576549] Code: b0 90 05 00 00 41 8b 87 64 ff ff ff 89 04 24 31 c0 e8 ab c1 5e 00 4d 8b 3f 4d 39 fc 75 b5 4c 3b a3 28 03 00 00 0f 84 af fe ff ff <0f> 0b 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 54 4c 8d a7 90 fe
[    1.576549] RIP  [<ffffffff81261516>] ext4_put_super+0x366/0x370
[    1.576549]  RSP <ffff880005e37e70>
[    1.596184] ---[ end trace e2c3a1b45e3598c1 ]---
[    1.596551] Kernel panic - not syncing: Fatal exception
[    1.597076] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
[    1.597870] Rebooting in 1 seconds..


2. Crash in start_this_handle
=============================

Test filesystems and diffs to the pristine image:

   http://www.niksula.hut.fi/~sliedes/ext4/start_this_handle/testimg.ext4.8473.min.bz2

--- /dev/fd/63  2014-10-05 02:22:37.396154814 +0300
+++ /dev/fd/62  2014-10-05 02:22:37.395154815 +0300
@@ -164,7 +164,7 @@
 *
 0000b000  02 00 00 00 0c 00 01 02  2e 00 00 00 02 00 00 00  |................|
 0000b010  0c 00 02 02 2e 2e 00 00  0b 00 00 00 14 00 0a 02  |................|
-0000b020  6c 6f 73 74 2b 66 6f 75  6e 64 00 00 0c 00 00 00  |lost+found......|
+0000b020  6c 6f 73 74 2b 66 6f 75  6e 64 00 00 08 00 00 00  |lost+found......|
 0000b030  0c 00 03 02 64 65 76 00  ff 04 00 00 c8 03 03 02  |....dev.........|
 0000b040  64 6f 63 00 00 00 00 00  00 00 00 00 00 00 00 00  |doc.............|
 0000b050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|


   http://www.niksula.hut.fi/~sliedes/ext4/start_this_handle/testimg.ext4.610085.min.bz2

--- /dev/fd/63  2014-10-05 02:22:37.100154947 +0300
+++ /dev/fd/62  2014-10-05 02:22:37.100154947 +0300
@@ -36276,7 +36276,7 @@
 001b8440  6c 76 6d 32 7c 6d 79 5f  63 6f 6e 74 61 69 6e 65  |lvm2|my_containe|
 001b8450  72 7c 6d 79 5f 72 65 67  69 6f 6e 00 07 04 00 00  |r|my_region.....|
 001b8460  18 00 0f 04 6d 79 76 67  2d 72 6f 6f 74 5f 63 72  |....myvg-root_cr|
-001b8470  79 70 74 00 08 04 00 00  28 00 1f 04 6c 76 6d 32  |ypt.....(...lvm2|
+001b8470  79 70 74 00 08 00 00 00  28 00 1f 04 6c 76 6d 32  |ypt.....(...lvm2|
 001b8480  7c 6d 79 5f 63 6f 6e 74  61 69 6e 65 72 7c 73 77  ||my_container|sw|
 001b8490  61 70 30 2d 63 72 79 70  74 65 64 00 09 04 00 00  |ap0-crypted.....|
 001b84a0  0c 00 04 04 73 64 64 32  0a 04 00 00 14 00 09 04  |....sdd2........|


The backtrace, trimmed from

   http://www.niksula.hut.fi/~sliedes/ext4/start_this_handle/testimg.ext4.8473.min.log

[    1.025503] EXT4-fs (vdb): mounted filesystem with ordered data mode. Opts: errors=continue
[    1.275936] ------------[ cut here ]------------
[    1.276860] kernel BUG at fs/jbd2/transaction.c:307!
[    1.277789] invalid opcode: 0000 [#1] SMP
[    1.278622] CPU: 0 PID: 208 Comm: umount Not tainted 3.16.3 #3
[    1.279721] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[    1.279862] task: ffff880005db5140 ti: ffff88000042c000 task.ti: ffff88000042c000
[    1.279862] RIP: 0010:[<ffffffff81293e60>]  [<ffffffff81293e60>] start_this_handle+0x330/0x760
[    1.279862] RSP: 0018:ffff88000042fc60  EFLAGS: 00010202
[    1.279862] RAX: 0000000000000039 RBX: ffff880005e06828 RCX: 0000000000000002
[    1.279862] RDX: 000000000000000a RSI: 0000000000000001 RDI: ffff880005e06828
[    1.279862] RBP: ffff88000042fd00 R08: 0000000000000000 R09: 0000000000000000
[    1.279862] R10: ffff880005e06840 R11: 0000000000000002 R12: ffff880005e06800
[    1.279862] R13: ffff8800067fc000 R14: ffff880005e06800 R15: 0000000000000000
[    1.279862] FS:  0000000000000000(0000) GS:ffff880007c00000(0063) knlGS:00000000f7424780
[    1.279862] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[    1.279862] CR2: 0000000009ae8014 CR3: 0000000005d53000 CR4: 00000000000006b0
[    1.279862] Stack:
[    1.279862]  0000000000000286 ffff880005db5810 ffff8800049102b9 ffff880005e06df8
[    1.279862]  0000000000000000 00000000fffedc46 ffff88000042fcc8 ffff8800067f9000
[    1.279862]  0000005b00000050 ffffffff0000005b ffffffff81293a1b ffff8800067fc000
[    1.279862] Call Trace:
[    1.279862]  [<ffffffff81293a1b>] ? new_handle+0x1b/0x50
[    1.279862]  [<ffffffff8129451b>] jbd2__journal_start+0xcb/0x1a0
[    1.279862]  [<ffffffff8124a45d>] ? ext4_evict_inode+0x17d/0x500
[    1.279862]  [<ffffffff81272635>] __ext4_journal_start_sb+0x65/0xd0
[    1.279862]  [<ffffffff8124a45d>] ext4_evict_inode+0x17d/0x500
[    1.279862]  [<ffffffff8116dd0f>] evict+0xaf/0x1a0
[    1.279862]  [<ffffffff8116eacd>] iput+0xed/0x190
[    1.279862]  [<ffffffff8129f418>] jbd2_journal_destroy+0x1a8/0x240
[    1.279862]  [<ffffffff810a7710>] ? __wake_up_common+0x90/0x90
[    1.279862]  [<ffffffff8126120f>] ext4_put_super+0x5f/0x370
[    1.279862]  [<ffffffff811547dd>] generic_shutdown_super+0x6d/0xf0
[    1.279862]  [<ffffffff81155a12>] kill_block_super+0x22/0x70
[    1.279862]  [<ffffffff811544fc>] deactivate_locked_super+0x3c/0x60
[    1.279862]  [<ffffffff8115457c>] deactivate_super+0x5c/0x60
[    1.279862]  [<ffffffff811728c1>] mntput_no_expire+0x171/0x260
[    1.279862]  [<ffffffff811744aa>] ? SyS_oldumount+0x7a/0xe0
[    1.279862]  [<ffffffff811744aa>] SyS_oldumount+0x7a/0xe0
[    1.279862]  [<ffffffff81859aa8>] sysenter_dispatch+0x7/0x25
[    1.279862] Code: 1f 40 00 8b 45 a8 3e 29 82 cc 00 00 00 4c 89 e7 e8 06 fc ff ff 48 89 df e8 fe 32 5c 00 49 8b 04 24 a8 01 0f 84 a7 fd ff ff 66 90 <0f> 0b 66 0f 1f 44 00 00 8b 45 a8 3e 41 29 00 48 89 df e8 19 34
[    1.279862] RIP  [<ffffffff81293e60>] start_this_handle+0x330/0x760
[    1.279862]  RSP <ffff88000042fc60>
[    1.301916] ---[ end trace 52c6387c01b65be9 ]---
[    1.302279] Kernel panic - not syncing: Fatal exception
[    1.302792] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
[    1.303577] Rebooting in 1 seconds..

	Sami



* [PATCH 1/2] ext4: don't orphan or truncate the boot loader inode
  2014-10-05  0:12 Intentionally corrupted ext4s causing two different kernel panics at umount Sami Liedes
@ 2014-10-06  2:48 ` Theodore Ts'o
  2014-10-06  2:48   ` [PATCH 2/2] ext4: add ext4_iget_normal() which is to be used for dir tree lookups Theodore Ts'o
  2014-10-06 15:06   ` [PATCH 1/2] ext4: don't orphan or truncate the boot loader inode Jan Kara
  2014-10-07 20:56 ` One more corrupted fs crash in ext4_put_super Sami Liedes
  1 sibling, 2 replies; 15+ messages in thread
From: Theodore Ts'o @ 2014-10-06  2:48 UTC (permalink / raw)
  To: Ext4 Developers List; +Cc: Theodore Ts'o, stable

The boot loader inode (inode #5) should never be visible in the
directory hierarchy, but it's possible if the file system is corrupted
that there will be a directory entry that points at inode #5.  In
order to avoid accidentally trashing it, when such a directory inode
is opened, the inode will be marked as a bad inode, so that it's not
possible to modify (or read) the inode from userspace.

Unfortunately, when we unlink this (invalid/illegal) directory entry,
we will put the bad inode on the orphan list, and then when we try to
unlink the directory, we don't actually remove the bad inode from the
orphan list before freeing the in-memory inode structure.  This means the
in-memory orphan list is corrupted, leading to a kernel oops.

In addition, avoid truncating a bad inode in ext4_evict_inode(),
since truncating the boot loader inode is not a smart thing to do.

Reported-by: Sami Liedes <sami.liedes@iki.fi>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
---
 fs/ext4/inode.c | 7 +++----
 fs/ext4/namei.c | 2 +-
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 41c4f97..59983b2 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -224,16 +224,15 @@ void ext4_evict_inode(struct inode *inode)
 		goto no_delete;
 	}
 
-	if (!is_bad_inode(inode))
-		dquot_initialize(inode);
+	if (is_bad_inode(inode))
+		goto no_delete;
+	dquot_initialize(inode);
 
 	if (ext4_should_order_data(inode))
 		ext4_begin_ordered_truncate(inode, 0);
 	truncate_inode_pages_final(&inode->i_data);
 
 	WARN_ON(atomic_read(&EXT4_I(inode)->i_ioend_count));
-	if (is_bad_inode(inode))
-		goto no_delete;
 
 	/*
 	 * Protect us against freezing - iput() caller didn't have to have any
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 51705f8..a2a9d40 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -2544,7 +2544,7 @@ int ext4_orphan_add(handle_t *handle, struct inode *inode)
 	int err = 0, rc;
 	bool dirty = false;
 
-	if (!sbi->s_journal)
+	if (!sbi->s_journal || is_bad_inode(inode))
 		return 0;
 
 	WARN_ON_ONCE(!(inode->i_state & (I_NEW | I_FREEING)) &&
-- 
2.1.0
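
The failure mode described in the commit message above can be modeled in
userspace. The sketch below is a simplified stand-in, not the kernel's
list implementation: struct node, list_init, list_add_node, list_del_node
and list_is_empty are hypothetical names. It illustrates why an entry that
is added but never removed leaves the list head non-empty at teardown,
which appears to be the condition hit in ext4_put_super() in the first
backtrace.

```c
#include <stdbool.h>

/* Minimal circular doubly linked list, loosely modeled on list_head. */
struct node {
	struct node *next, *prev;
};

void list_init(struct node *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert n right after head. */
void list_add_node(struct node *head, struct node *n)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink n and make it point at itself again. */
void list_del_node(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = n;
	n->prev = n;
}

bool list_is_empty(const struct node *head)
{
	return head->next == head;
}
```

If a node is added and its owner is then freed without list_del_node(),
the head keeps pointing at freed memory and list_is_empty() stays false.
The analogue of the ext4_orphan_add() change in this model is simply never
adding a bad inode in the first place.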


* [PATCH 2/2] ext4: add ext4_iget_normal() which is to be used for dir tree lookups
  2014-10-06  2:48 ` [PATCH 1/2] ext4: don't orphan or truncate the boot loader inode Theodore Ts'o
@ 2014-10-06  2:48   ` Theodore Ts'o
  2014-10-06  2:52     ` Andreas Dilger
  2014-10-06 15:09     ` Jan Kara
  2014-10-06 15:06   ` [PATCH 1/2] ext4: don't orphan or truncate the boot loader inode Jan Kara
  1 sibling, 2 replies; 15+ messages in thread
From: Theodore Ts'o @ 2014-10-06  2:48 UTC (permalink / raw)
  To: Ext4 Developers List; +Cc: Theodore Ts'o

If there is a corrupted file system which has directory entries that
point at reserved, metadata inodes, prohibit them from being used by
treating them the same way we treat Boot Loader inodes --- that is,
mark them to be bad inodes.  This prohibits them from being opened,
deleted, or modified via chmod, chown, utimes, etc.

In particular, this prevents a corrupted file system which has a
directory entry which points at the journal inode from being deleted
and being released, after which point Much Hilarity Ensues.

Reported-by: Sami Liedes <sami.liedes@iki.fi>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
---
 fs/ext4/ext4.h  |  1 +
 fs/ext4/inode.c | 10 ++++++++++
 fs/ext4/namei.c |  4 ++--
 fs/ext4/super.c |  2 +-
 4 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 1eb5b7b..012e89b 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2109,6 +2109,7 @@ int do_journal_get_write_access(handle_t *handle,
 #define CONVERT_INLINE_DATA	 2
 
 extern struct inode *ext4_iget(struct super_block *, unsigned long);
+extern struct inode *ext4_iget_normal(struct super_block *, unsigned long);
 extern int  ext4_write_inode(struct inode *, struct writeback_control *);
 extern int  ext4_setattr(struct dentry *, struct iattr *);
 extern int  ext4_getattr(struct vfsmount *mnt, struct dentry *dentry,
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 59983b2..437622c 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4104,6 +4104,16 @@ bad_inode:
 	return ERR_PTR(ret);
 }
 
+struct inode *ext4_iget_normal(struct super_block *sb, unsigned long ino)
+{
+	struct inode *ret_inode = ext4_iget(sb, ino);
+
+	if (ret_inode && !IS_ERR(ret_inode) &&
+	    ino < EXT4_FIRST_INO(sb) && ino != EXT4_ROOT_INO)
+		make_bad_inode(ret_inode);
+	return ret_inode;
+}
+
 static int ext4_inode_blocks_set(handle_t *handle,
 				struct ext4_inode *raw_inode,
 				struct ext4_inode_info *ei)
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index a2a9d40..7037ecf 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1417,7 +1417,7 @@ static struct dentry *ext4_lookup(struct inode *dir, struct dentry *dentry, unsi
 					 dentry);
 			return ERR_PTR(-EIO);
 		}
-		inode = ext4_iget(dir->i_sb, ino);
+		inode = ext4_iget_normal(dir->i_sb, ino);
 		if (inode == ERR_PTR(-ESTALE)) {
 			EXT4_ERROR_INODE(dir,
 					 "deleted inode referenced: %u",
@@ -1450,7 +1450,7 @@ struct dentry *ext4_get_parent(struct dentry *child)
 		return ERR_PTR(-EIO);
 	}
 
-	return d_obtain_alias(ext4_iget(child->d_inode->i_sb, ino));
+	return d_obtain_alias(ext4_iget_normal(child->d_inode->i_sb, ino));
 }
 
 /*
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 1070d6e..a0811cc 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1001,7 +1001,7 @@ static struct inode *ext4_nfs_get_inode(struct super_block *sb,
 	 * Currently we don't know the generation for parent directory, so
 	 * a generation of 0 means "accept any"
 	 */
-	inode = ext4_iget(sb, ino);
+	inode = ext4_iget_normal(sb, ino);
 	if (IS_ERR(inode))
 		return ERR_CAST(inode);
 	if (generation && inode->i_generation != generation) {
-- 
2.1.0
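
The condition added by ext4_iget_normal() can be exercised outside the
kernel. This is a sketch under assumptions: first_ino stands in for
EXT4_FIRST_INO(sb), which comes from the superblock and is taken to be 11
in the examples, and ROOT_INO mirrors EXT4_ROOT_INO (2). The function name
is hypothetical.

```c
#include <stdbool.h>

#define ROOT_INO 2UL	/* mirrors EXT4_ROOT_INO */

/*
 * Hypothetical userspace model of the condition in ext4_iget_normal():
 * an inode number reached through the directory tree is rejected when it
 * lies below the first non-reserved inode and is not the root directory.
 */
bool ino_is_reserved_for_lookup(unsigned long ino, unsigned long first_ino)
{
	return ino < first_ino && ino != ROOT_INO;
}
```

With first_ino set to 11, inodes 5 (boot loader) and 8 (journal) are
rejected, while the root inode 2 and ordinary inodes such as 12 pass.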



* Re: [PATCH 2/2] ext4: add ext4_iget_normal() which is to be used for dir tree lookups
  2014-10-06  2:48   ` [PATCH 2/2] ext4: add ext4_iget_normal() which is to be used for dir tree lookups Theodore Ts'o
@ 2014-10-06  2:52     ` Andreas Dilger
  2014-10-06  3:16       ` Theodore Ts'o
  2014-10-06 15:09     ` Jan Kara
  1 sibling, 1 reply; 15+ messages in thread
From: Andreas Dilger @ 2014-10-06  2:52 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Ext4 Developers List


On Oct 5, 2014, at 8:48 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> If there is a corrupted file system which has directory entries that
> point at reserved, metadata inodes, prohibit them from being used by
> treating them the same way we treat Boot Loader inodes --- that is,
> mark them to be bad inodes.  This prohibits them from being opened,
> deleted, or modified via chmod, chown, utimes, etc.
> 
> In particular, this prevents a corrupted file system which has a
> directory entry which points at the journal inode from being deleted
> and being released, after which point Much Hilarity Ensues.

Wouldn't it be safer to change "ext4_iget()" to have these checks, 
and add an "ext4_iget_special()" or "ext4_iget_reserved()" for use
in the few places that are opening reserved inodes?  That would
probably be safer for the future.

Cheers, Andreas

> Reported-by: Sami Liedes <sami.liedes@iki.fi>
> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
> ---
> fs/ext4/ext4.h  |  1 +
> fs/ext4/inode.c | 10 ++++++++++
> fs/ext4/namei.c |  4 ++--
> fs/ext4/super.c |  2 +-
> 4 files changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 1eb5b7b..012e89b 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -2109,6 +2109,7 @@ int do_journal_get_write_access(handle_t *handle,
> #define CONVERT_INLINE_DATA	 2
> 
> extern struct inode *ext4_iget(struct super_block *, unsigned long);
> +extern struct inode *ext4_iget_normal(struct super_block *, unsigned long);
> extern int  ext4_write_inode(struct inode *, struct writeback_control *);
> extern int  ext4_setattr(struct dentry *, struct iattr *);
> extern int  ext4_getattr(struct vfsmount *mnt, struct dentry *dentry,
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 59983b2..437622c 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4104,6 +4104,16 @@ bad_inode:
> 	return ERR_PTR(ret);
> }
> 
> +struct inode *ext4_iget_normal(struct super_block *sb, unsigned long ino)
> +{
> +	struct inode *ret_inode = ext4_iget(sb, ino);
> +
> +	if (ret_inode && !IS_ERR(ret_inode) &&
> +	    ino < EXT4_FIRST_INO(sb) && ino != EXT4_ROOT_INO)
> +		make_bad_inode(ret_inode);
> +	return ret_inode;
> +}
> +
> static int ext4_inode_blocks_set(handle_t *handle,
> 				struct ext4_inode *raw_inode,
> 				struct ext4_inode_info *ei)
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index a2a9d40..7037ecf 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -1417,7 +1417,7 @@ static struct dentry *ext4_lookup(struct inode *dir, struct dentry *dentry, unsi
> 					 dentry);
> 			return ERR_PTR(-EIO);
> 		}
> -		inode = ext4_iget(dir->i_sb, ino);
> +		inode = ext4_iget_normal(dir->i_sb, ino);
> 		if (inode == ERR_PTR(-ESTALE)) {
> 			EXT4_ERROR_INODE(dir,
> 					 "deleted inode referenced: %u",
> @@ -1450,7 +1450,7 @@ struct dentry *ext4_get_parent(struct dentry *child)
> 		return ERR_PTR(-EIO);
> 	}
> 
> -	return d_obtain_alias(ext4_iget(child->d_inode->i_sb, ino));
> +	return d_obtain_alias(ext4_iget_normal(child->d_inode->i_sb, ino));
> }
> 
> /*
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 1070d6e..a0811cc 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -1001,7 +1001,7 @@ static struct inode *ext4_nfs_get_inode(struct super_block *sb,
> 	 * Currently we don't know the generation for parent directory, so
> 	 * a generation of 0 means "accept any"
> 	 */
> -	inode = ext4_iget(sb, ino);
> +	inode = ext4_iget_normal(sb, ino);
> 	if (IS_ERR(inode))
> 		return ERR_CAST(inode);
> 	if (generation && inode->i_generation != generation) {
> -- 
> 2.1.0
> 




* Re: [PATCH 2/2] ext4: add ext4_iget_normal() which is to be used for dir tree lookups
  2014-10-06  2:52     ` Andreas Dilger
@ 2014-10-06  3:16       ` Theodore Ts'o
  0 siblings, 0 replies; 15+ messages in thread
From: Theodore Ts'o @ 2014-10-06  3:16 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Ext4 Developers List

On Sun, Oct 05, 2014 at 08:52:38PM -0600, Andreas Dilger wrote:
> On Oct 5, 2014, at 8:48 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> > If there is a corrupted file system which has directory entries that
> > point at reserved, metadata inodes, prohibit them from being used by
> > treating them the same way we treat Boot Loader inodes --- that is,
> > mark them to be bad inodes.  This prohibits them from being opened,
> > deleted, or modified via chmod, chown, utimes, etc.
> > 
> > In particular, this prevents a corrupted file system which has a
> > directory entry which points at the journal inode from being deleted
> > and being released, after which point Much Hilarity Ensues.
> 
> Wouldn't it be safer to change "ext4_iget()" to have these checks, 
> and add an "ext4_iget_special()" or "ext4_iget_reserved()" for use
> in the few places that are opening reserved inodes?  That would
> probably be safer for the future.

There is actually a much larger set of places where we iget reserved
inodes -- in fact, double the number of places where we return
inodes back up to the VFS --- 3 for the latter, and 6 for the former.

As for future additions, it's much more likely that we would be adding
new code paths to read reserved inodes.  New VFS functionality tends
to go through the dcache layer, so I don't see the likelihood of
needing to add a new call to ext4_iget_normal() any time soon.

Cheers,

						- Ted


* Re: [PATCH 2/2] ext4: add ext4_iget_normal() which is to be used for dir tree lookups
  2014-10-06  2:48   ` [PATCH 2/2] ext4: add ext4_iget_normal() which is to be used for dir tree lookups Theodore Ts'o
  2014-10-06  2:52     ` Andreas Dilger
@ 2014-10-06 15:09     ` Jan Kara
  2014-10-06 18:55       ` Theodore Ts'o
  1 sibling, 1 reply; 15+ messages in thread
From: Jan Kara @ 2014-10-06 15:09 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Ext4 Developers List

On Sun 05-10-14 22:48:02, Ted Tso wrote:
> If there is a corrupted file system which has directory entries that
> point at reserved, metadata inodes, prohibit them from being used by
> treating them the same way we treat Boot Loader inodes --- that is,
> mark them to be bad inodes.  This prohibits them from being opened,
> deleted, or modified via chmod, chown, utimes, etc.
> 
> In particular, this prevents a corrupted file system which has a
> directory entry which points at the journal inode from being deleted
> and being released, after which point Much Hilarity Ensues.
> 
> Reported-by: Sami Liedes <sami.liedes@iki.fi>
> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
> ---
>  fs/ext4/ext4.h  |  1 +
>  fs/ext4/inode.c | 10 ++++++++++
>  fs/ext4/namei.c |  4 ++--
>  fs/ext4/super.c |  2 +-
>  4 files changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 1eb5b7b..012e89b 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -2109,6 +2109,7 @@ int do_journal_get_write_access(handle_t *handle,
>  #define CONVERT_INLINE_DATA	 2
>  
>  extern struct inode *ext4_iget(struct super_block *, unsigned long);
> +extern struct inode *ext4_iget_normal(struct super_block *, unsigned long);
>  extern int  ext4_write_inode(struct inode *, struct writeback_control *);
>  extern int  ext4_setattr(struct dentry *, struct iattr *);
>  extern int  ext4_getattr(struct vfsmount *mnt, struct dentry *dentry,
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 59983b2..437622c 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4104,6 +4104,16 @@ bad_inode:
>  	return ERR_PTR(ret);
>  }
>  
> +struct inode *ext4_iget_normal(struct super_block *sb, unsigned long ino)
> +{
> +	struct inode *ret_inode = ext4_iget(sb, ino);
> +
> +	if (ret_inode && !IS_ERR(ret_inode) &&
> +	    ino < EXT4_FIRST_INO(sb) && ino != EXT4_ROOT_INO)
> +		make_bad_inode(ret_inode);
> +	return ret_inode;
  Hum, why don't we just return an error (like EIO) when an invalid inode
number is passed?

								Honza
> +}
> +
>  static int ext4_inode_blocks_set(handle_t *handle,
>  				struct ext4_inode *raw_inode,
>  				struct ext4_inode_info *ei)
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index a2a9d40..7037ecf 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -1417,7 +1417,7 @@ static struct dentry *ext4_lookup(struct inode *dir, struct dentry *dentry, unsi
>  					 dentry);
>  			return ERR_PTR(-EIO);
>  		}
> -		inode = ext4_iget(dir->i_sb, ino);
> +		inode = ext4_iget_normal(dir->i_sb, ino);
>  		if (inode == ERR_PTR(-ESTALE)) {
>  			EXT4_ERROR_INODE(dir,
>  					 "deleted inode referenced: %u",
> @@ -1450,7 +1450,7 @@ struct dentry *ext4_get_parent(struct dentry *child)
>  		return ERR_PTR(-EIO);
>  	}
>  
> -	return d_obtain_alias(ext4_iget(child->d_inode->i_sb, ino));
> +	return d_obtain_alias(ext4_iget_normal(child->d_inode->i_sb, ino));
>  }
>  
>  /*
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 1070d6e..a0811cc 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -1001,7 +1001,7 @@ static struct inode *ext4_nfs_get_inode(struct super_block *sb,
>  	 * Currently we don't know the generation for parent directory, so
>  	 * a generation of 0 means "accept any"
>  	 */
> -	inode = ext4_iget(sb, ino);
> +	inode = ext4_iget_normal(sb, ino);
>  	if (IS_ERR(inode))
>  		return ERR_CAST(inode);
>  	if (generation && inode->i_generation != generation) {
> -- 
> 2.1.0
> 
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


* Re: [PATCH 2/2] ext4: add ext4_iget_normal() which is to be used for dir tree lookups
  2014-10-06 15:09     ` Jan Kara
@ 2014-10-06 18:55       ` Theodore Ts'o
  0 siblings, 0 replies; 15+ messages in thread
From: Theodore Ts'o @ 2014-10-06 18:55 UTC (permalink / raw)
  To: Jan Kara; +Cc: Ext4 Developers List

On Mon, Oct 06, 2014 at 05:09:03PM +0200, Jan Kara wrote:
> > +	if (ret_inode && !IS_ERR(ret_inode) &&
> > +	    ino < EXT4_FIRST_INO(sb) && ino != EXT4_ROOT_INO)
> > +		make_bad_inode(ret_inode);
> > +	return ret_inode;
>   Hum, why don't we just return an error (like EIO) when an invalid inode
> number is passed?

Yeah, I guess we can do that.  We need to support the make_bad_inode()
for the sake of EXT4_IOC_SWAP_BOOT.  But that code path doesn't need
to use ext4_iget_normal().  So yeah, in the case of
ext4_iget_normal(), we should be able to just return -EIO and let the
userspace fail fast with the open(2) instead of later on with the
read(2) or write(2) or truncate(2) call.

						- Ted
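
The difference between the two behaviours discussed in this reply can be
illustrated with a small userspace model. Everything below is hypothetical
(struct fake_inode, lookup_mark_bad, lookup_fail_fast, fake_read) and only
mimics where the error surfaces; first_ino stands in for
EXT4_FIRST_INO(sb) and is taken to be 11 in the examples.

```c
#include <errno.h>
#include <stdbool.h>

struct fake_inode {
	unsigned long ino;
	bool bad;	/* analogue of make_bad_inode() having been called */
};

static bool reserved(unsigned long ino, unsigned long first_ino)
{
	return ino < first_ino && ino != 2;	/* 2 mirrors EXT4_ROOT_INO */
}

/* Variant as posted: lookup succeeds, the inode is flagged bad. */
int lookup_mark_bad(unsigned long ino, unsigned long first_ino,
		    struct fake_inode *out)
{
	out->ino = ino;
	out->bad = reserved(ino, first_ino);
	return 0;
}

/* Variant suggested in review: the lookup itself fails with -EIO. */
int lookup_fail_fast(unsigned long ino, unsigned long first_ino,
		     struct fake_inode *out)
{
	if (reserved(ino, first_ino))
		return -EIO;
	out->ino = ino;
	out->bad = false;
	return 0;
}

/* In this model, later operations on a bad inode fail with -EIO. */
int fake_read(const struct fake_inode *inode)
{
	return inode->bad ? -EIO : 0;
}
```

In the model, looking up inode 8 with lookup_mark_bad() returns 0 and the
error only shows up in fake_read(), whereas lookup_fail_fast() reports
-EIO immediately and never hands out the inode.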


* Re: [PATCH 1/2] ext4: don't orphan or truncate the boot loader inode
  2014-10-06  2:48 ` [PATCH 1/2] ext4: don't orphan or truncate the boot loader inode Theodore Ts'o
  2014-10-06  2:48   ` [PATCH 2/2] ext4: add ext4_iget_normal() which is to be used for dir tree lookups Theodore Ts'o
@ 2014-10-06 15:06   ` Jan Kara
  1 sibling, 0 replies; 15+ messages in thread
From: Jan Kara @ 2014-10-06 15:06 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Ext4 Developers List, stable

On Sun 05-10-14 22:48:01, Ted Tso wrote:
> The boot loader inode (inode #5) should never be visible in the
> directory hierarchy, but it's possible if the file system is corrupted
> that there will be a directory entry that points at inode #5.  In
> order to avoid accidentally trashing it, when such a directory inode
> is opened, the inode will be marked as a bad inode, so that it's not
> possible to modify (or read) the inode from userspace.
> 
> Unfortunately, when we unlink this (invalid/illegal) directory entry,
> we will put the bad inode on the orphan list, and then when we try to
> unlink the directory, we don't actually remove the bad inode from the
> orphan list before freeing the in-memory inode structure.  This means the
> in-memory orphan list is corrupted, leading to a kernel oops.
> 
> In addition, avoid truncating a bad inode in ext4_destroy_inode(),
> since truncating the boot loader inode is not a smart thing to do.
> 
> Reported-by: Sami Liedes <sami.liedes@iki.fi>
> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
> Cc: stable@vger.kernel.org
  The patch looks good. You can add:
Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ext4/inode.c | 7 +++----
>  fs/ext4/namei.c | 2 +-
>  2 files changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 41c4f97..59983b2 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -224,16 +224,15 @@ void ext4_evict_inode(struct inode *inode)
>  		goto no_delete;
>  	}
>  
> -	if (!is_bad_inode(inode))
> -		dquot_initialize(inode);
> +	if (is_bad_inode(inode))
> +		goto no_delete;
> +	dquot_initialize(inode);
>  
>  	if (ext4_should_order_data(inode))
>  		ext4_begin_ordered_truncate(inode, 0);
>  	truncate_inode_pages_final(&inode->i_data);
>  
>  	WARN_ON(atomic_read(&EXT4_I(inode)->i_ioend_count));
> -	if (is_bad_inode(inode))
> -		goto no_delete;
>  
>  	/*
>  	 * Protect us against freezing - iput() caller didn't have to have any
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index 51705f8..a2a9d40 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -2544,7 +2544,7 @@ int ext4_orphan_add(handle_t *handle, struct inode *inode)
>  	int err = 0, rc;
>  	bool dirty = false;
>  
> -	if (!sbi->s_journal)
> +	if (!sbi->s_journal || is_bad_inode(inode))
>  		return 0;
>  
>  	WARN_ON_ONCE(!(inode->i_state & (I_NEW | I_FREEING)) &&
> -- 
> 2.1.0
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

* One more corrupted fs crash in ext4_put_super
  2014-10-05  0:12 Intentionally corrupted ext4s causing two different kernel panics at umount Sami Liedes
  2014-10-06  2:48 ` [PATCH 1/2] ext4: don't orphan or truncate the boot loader inode Theodore Ts'o
@ 2014-10-07 20:56 ` Sami Liedes
  2014-10-07 21:57   ` Darrick J. Wong
  2014-10-09 20:15   ` Sami Liedes
  1 sibling, 2 replies; 15+ messages in thread
From: Sami Liedes @ 2014-10-07 20:56 UTC (permalink / raw)
  To: linux-ext4; +Cc: Theodore Ts'o

Hi,

Here's one more filesystem that causes a crash in ext4_put_super on
3.17 both with and without the two patches from this thread applied.

Interestingly this one does not seem to crash on 3.16.4, with or
without the patches from this thread. Even on 3.17 I *think* I've seen
it not crash, but the reproducibility seems to be well over 95%.

Crashing image:

  http://www.niksula.hut.fi/~sliedes/ext4/ext4_put_super/testimg.ext4.112041.min.bz2

Pristine image:

  http://www.niksula.hut.fi/~sliedes/ext4/testimg.ext4.pristine.bz2

Diff:

--- /dev/fd/63  2014-10-07 23:52:33.397018880 +0300
+++ /dev/fd/62  2014-10-07 23:52:33.398018880 +0300
@@ -36771,7 +36771,7 @@
 001bd040  65 76 65 6e 74 30 00 00  b8 04 00 00 10 00 05 02  |event0..........|
 001bd050  62 79 2d 69 64 00 00 00  bc 04 00 00 10 00 07 02  |by-id...........|
 001bd060  62 79 2d 70 61 74 68 00  c2 04 00 00 10 00 06 03  |by-path.........|
-001bd070  65 76 65 6e 74 35 00 00  c3 04 00 00 0c 00 04 03  |event5..........|
+001bd070  65 76 65 6e 74 35 00 00  c3 00 00 00 0c 00 04 03  |event5..........|
 001bd080  6d 69 63 65 c4 04 00 00  10 00 06 03 65 76 65 6e  |mice........even|
 001bd090  74 32 00 00 c5 04 00 00  10 00 06 03 65 76 65 6e  |t2..........even|
 001bd0a0  74 33 00 00 c6 04 00 00  5c 03 06 03 65 76 65 6e  |t3......\...even|
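
The changed bytes can be read as the header of an ext4 directory entry.
The sketch below decodes the eight bytes at offset 0x001bd078 from both
images, assuming the usual ext4_dir_entry_2 layout (32-bit inode, 16-bit
rec_len, 8-bit name_len, 8-bit file_type, all little-endian).  Reading the
hexdump this way is an interpretation, not something stated in the report;
the helper names are made up for the example.

```c
#include <stdint.h>

/* Fixed 8-byte header of an ext4 directory entry (ext4_dir_entry_2 layout). */
struct dirent_hdr {
	uint32_t inode;      /* little-endian on disk */
	uint16_t rec_len;
	uint8_t  name_len;
	uint8_t  file_type;
};

/* Decode the header from raw on-disk bytes, independent of host endianness. */
struct dirent_hdr decode_dirent(const uint8_t *p)
{
	struct dirent_hdr d;

	d.inode = (uint32_t)p[0] | (uint32_t)p[1] << 8 |
		  (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
	d.rec_len = (uint16_t)(p[4] | p[5] << 8);
	d.name_len = p[6];
	d.file_type = p[7];
	return d;
}

/* Count the bits that differ between two inode numbers. */
int differing_bits(uint32_t a, uint32_t b)
{
	uint32_t x = a ^ b;
	int n = 0;

	while (x) {
		n += x & 1;
		x >>= 1;
	}
	return n;
}

/* Bytes at offset 0x001bd078 of the pristine and the corrupted image. */
const uint8_t pristine_bytes[8]  = { 0xc3, 0x04, 0x00, 0x00, 0x0c, 0x00, 0x04, 0x03 };
const uint8_t corrupted_bytes[8] = { 0xc3, 0x00, 0x00, 0x00, 0x0c, 0x00, 0x04, 0x03 };
```

Decoded this way, the entry named "mice" (name_len 4, rec_len 12) pointed
at inode 0x4c3 (1219) in the pristine image and points at inode 0xc3 (195)
in the corrupted one, and the two inode numbers differ in exactly one bit.
Inode 195 is the one shown on the orphan list in the backtrace.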

Backtrace:

[    1.936509] EXT4-fs (vdb): sb orphan head is 195
[    1.936889] sb_info orphan list:
[    1.937145]   inode vdb:195 at ffff880006675d90: mode 40755, nlink 0, next 0
[    1.937699] ------------[ cut here ]------------
[    1.938057] kernel BUG at fs/ext4/super.c:836!
[    1.938419] invalid opcode: 0000 [#1] SMP
[    1.938788] CPU: 0 PID: 1041 Comm: umount Not tainted 3.17.0+ #32
[    1.939278] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[    1.940059] task: ffff8800060bd2d0 ti: ffff88000639c000 task.ti: ffff88000639c000
[    1.940299] RIP: 0010:[<ffffffff812753e6>]  [<ffffffff812753e6>] ext4_put_super+0x366/0x370
[    1.940299] RSP: 0018:ffff88000639fe70  EFLAGS: 00010287
[    1.940299] RAX: 0000000000000040 RBX: ffff8800063b6800 RCX: 0000000000006665
[    1.940299] RDX: 0000000000000040 RSI: 0000000000000001 RDI: 0000000000000286
[    1.940299] RBP: ffff88000639fea0 R08: 0000000000000001 R09: 0000000000000000
[    1.940299] R10: 0000000000000000 R11: 0000000000000001 R12: ffff8800063b6b28
[    1.940299] R13: ffff8800063b6000 R14: ffff8800063b6a88 R15: ffff8800063b6b28
[    1.940299] FS:  0000000000000000(0000) GS:ffff880007c00000(0063) knlGS:00000000f7549780
[    1.940299] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[    1.940299] CR2: 000000000a02e004 CR3: 000000000635f000 CR4: 00000000000006b0
[    1.940299] Stack:
[    1.940299]  ffff880000000000 ffff8800063b6000 ffff8800063b60f8 ffffffff81a33e00
[    1.940299]  0000000000000000 0000000000000000 ffff88000639fec8 ffffffff81164ebd
[    1.940299]  0000000000000083 ffff880006c0d600 ffff8800063a2780 ffff88000639fee8
[    1.940299] Call Trace:
[    1.940299]  [<ffffffff81164ebd>] generic_shutdown_super+0x6d/0xf0
[    1.940299]  [<ffffffff81166122>] kill_block_super+0x22/0x70
[    1.940299]  [<ffffffff81164bdc>] deactivate_locked_super+0x3c/0x60
[    1.940299]  [<ffffffff81164c5c>] deactivate_super+0x5c/0x60
[    1.940299]  [<ffffffff81183cd0>] mntput_no_expire+0x180/0x210
[    1.940299]  [<ffffffff81185757>] ? SyS_umount+0x87/0x100
[    1.940299]  [<ffffffff81185757>] SyS_umount+0x87/0x100
[    1.940299]  [<ffffffff8188e888>] sysenter_dispatch+0x7/0x2a
[    1.940299]  [<ffffffff8165e9cb>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[    1.940299] Code: b0 10 05 00 00 41 8b 87 64 ff ff ff 89 04 24 31 c0 e8 f7 ae 60 00 4d 8b 3f 4d 39 fc 75 b5 4c 3b a3 28 03 00 00 0f 84 af fe ff ff <0f> 0b 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 54 4c 8d a7 90 fe
[    1.940299] RIP  [<ffffffff812753e6>] ext4_put_super+0x366/0x370
[    1.940299]  RSP <ffff88000639fe70>
[    1.958649] ---[ end trace 6419dd181c457894 ]---
[    1.959008] Kernel panic - not syncing: Fatal exception
[    1.959568] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
[    1.960337] Rebooting in 1 seconds..

* Re: One more corrupted fs crash in ext4_put_super
  2014-10-07 20:56 ` One more corrupted fs crash in ext4_put_super Sami Liedes
@ 2014-10-07 21:57   ` Darrick J. Wong
  2014-10-07 22:22     ` Darrick J. Wong
  2014-10-09 20:15   ` Sami Liedes
  1 sibling, 1 reply; 15+ messages in thread
From: Darrick J. Wong @ 2014-10-07 21:57 UTC (permalink / raw)
  To: Sami Liedes, linux-ext4, Theodore Ts'o

On Tue, Oct 07, 2014 at 11:56:43PM +0300, Sami Liedes wrote:
> Hi,
> 
> Here's one more filesystem that causes a crash in ext4_put_super on
> 3.17 both with and without the two patches from this thread applied.
> 
> Interestingly this one does not seem to crash on 3.16.4, with or
> without the patches from this thread. Even on 3.17 I *think* I've seen
> it not crash, but the reproducibility seems to be well over 95%.

Oh, I got it to crash on 3.17. :)

Does mounting with -o block_validity eliminate the backtrace, at least?  With
that option, I get this instead:

EXT4-fs error (device loop0): ext4_map_blocks:559: inode #8: block 139: comm jbd2/loop0-8: lblock 15 mapped to illegal pblock (length 1)
jbd2_journal_bmap: journal block not found at offset 15 on loop0-8

...and a journal abort.  Not nice, but at least the kernel doesn't blow up.

--D

* Re: One more corrupted fs crash in ext4_put_super
  2014-10-07 21:57   ` Darrick J. Wong
@ 2014-10-07 22:22     ` Darrick J. Wong
  0 siblings, 0 replies; 15+ messages in thread
From: Darrick J. Wong @ 2014-10-07 22:22 UTC (permalink / raw)
  To: Sami Liedes, linux-ext4, Theodore Ts'o

On Tue, Oct 07, 2014 at 02:57:40PM -0700, Darrick J. Wong wrote:
> On Tue, Oct 07, 2014 at 11:56:43PM +0300, Sami Liedes wrote:
> > Hi,
> > 
> > Here's one more filesystem that causes a crash in ext4_put_super on
> > 3.17 both with and without the two patches from this thread applied.
> > 
> > Interestingly this one does not seem to crash on 3.16.4, with or
> > without the patches from this thread. Even on 3.17 I *think* I've seen
> > it not crash, but the reproducibility seems to be well over 95%.
> 
> Oh, I got it to crash on 3.17. :)
> 
> Does mounting with -o block_validity eliminate the backtrace, at least?  With
> that option, I get this instead:
> 
> EXT4-fs error (device loop0): ext4_map_blocks:559: inode #8: block 139: comm jbd2/loop0-8: lblock 15 mapped to illegal pblock (length 1)
> jbd2_journal_bmap: journal block not found at offset 15 on loop0-8
> 
> ...and a journal abort.  Not nice, but at least the kernel doesn't blow up.

Rats, replied to the wrong crash report.  All of what I said applies to the
jbd2_commit_transaction crash, not this.

--D

* Re: One more corrupted fs crash in ext4_put_super
  2014-10-07 20:56 ` One more corrupted fs crash in ext4_put_super Sami Liedes
  2014-10-07 21:57   ` Darrick J. Wong
@ 2014-10-09 20:15   ` Sami Liedes
  2014-10-09 20:49     ` Darrick J. Wong
  1 sibling, 1 reply; 15+ messages in thread
From: Sami Liedes @ 2014-10-09 20:15 UTC (permalink / raw)
  To: linux-ext4

On Tue, Oct 07, 2014 at 11:56:43PM +0300, Sami Liedes wrote:
> Here's one more filesystem that causes a crash in ext4_put_super on
> 3.17 both with and without the two patches from this thread applied.

Ok, I bisected a bit. FWIW.

No crash on 3.16.4 + these two patches:

1c8944cbe1b ext4: add ext4_iget_normal() which is to be used for dir tree lookups
b65ad45743c ext4: don't orphan or truncate the boot loader inode

Crash on 3.17 + the above two patches.

The first commit that crashes on this test with the above patches:

# first bad commit: [908790fa3b779d37365e6b28e3aa0f6e833020c3] dcache: d_splice_alias mustn't create directory aliases

commit 908790fa3b779d37365e6b28e3aa0f6e833020c3
Author: J. Bruce Fields <bfields@redhat.com>
Date:   Mon Feb 17 17:58:42 2014 -0500

    dcache: d_splice_alias mustn't create directory aliases

    Currently if d_splice_alias finds a directory with an alias that is not
    IS_ROOT or not DCACHE_DISCONNECTED, it creates a duplicate directory.

    Duplicate directory dentries are unacceptable; it is better just to
    error out.

    (In the case of a local filesystem the most likely case is filesystem
    corruption: for example, perhaps two directories point to the same child
    directory, and the other parent has already been found and cached.)

    Note that distributed filesystems may encounter this case in normal
    operation if a remote host moves a directory to a location different
    from the one we last cached in the dcache.  For that reason, such
    filesystems should instead use d_materialise_unique, which tries to move
    the old directory alias to the right place instead of erroring out.

    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

-- 

	Sami

* Re: One more corrupted fs crash in ext4_put_super
  2014-10-09 20:15   ` Sami Liedes
@ 2014-10-09 20:49     ` Darrick J. Wong
  2014-10-09 21:28       ` A very similar crash on ext2 Sami Liedes
  0 siblings, 1 reply; 15+ messages in thread
From: Darrick J. Wong @ 2014-10-09 20:49 UTC (permalink / raw)
  To: Sami Liedes, linux-ext4

On Thu, Oct 09, 2014 at 11:15:41PM +0300, Sami Liedes wrote:
> On Tue, Oct 07, 2014 at 11:56:43PM +0300, Sami Liedes wrote:
> > Here's one more filesystem that causes a crash in ext4_put_super on
> > 3.17 both with and without the two patches from this thread applied.
> 
> Ok, I bisected a bit. FWIW.
> 
> No crash on 3.16.4 + these two patches:
> 
> 1c8944cbe1b ext4: add ext4_iget_normal() which is to be used for dir tree lookups
> b65ad45743c ext4: don't orphan or truncate the boot loader inode
> 
> Crash on 3.17 + the above two patches.
> 
> The first commit that crashes on this test with the above patches:

Yeah.  There's a directory that's linked twice (inode 195).  The subsequent FS
walk loads the inode into memory twice (== i_count > 2).  When you delete
everything on the FS, the inode gets put on the in-memory orphan list but for
whatever reason doesn't seem to get released via iput or something.  This means
it's still on the orphan list at umount time, which triggers the BUG.  Worse
yet, i_nlink is now 0...

...not clear what the appropriate course of action is here.  The FS is corrupt
and we need to scrape the mess off the machine.  I guess you could -EIO earlier
when you notice i_count > i_nlink?
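
One possible reading of the failure described above can be sketched as a
user-space model.  Everything here is hypothetical: the struct and function
names are invented for the example, and the assumption that a leftover
reference is what keeps the inode on the list is only one interpretation
of the symptoms.  The model has an orphan list, an inode that leaves the
list only when its last reference is dropped, and an unmount-time check
that the list is empty.

```c
#include <stddef.h>

/* Minimal doubly linked list node, in the style of the kernel's list_head. */
struct list_node {
	struct list_node *prev, *next;
};

/* Hypothetical stand-ins for the in-memory inode and superblock info. */
struct model_inode {
	unsigned long ino;
	int refcount;             /* models i_count */
	struct list_node orphan;  /* models the in-memory orphan link */
};

struct model_sb {
	struct list_node orphans; /* models the per-sb orphan list head */
};

void sb_init(struct model_sb *sb)
{
	sb->orphans.prev = sb->orphans.next = &sb->orphans;
}

void inode_init(struct model_inode *in, unsigned long ino, int refcount)
{
	in->ino = ino;
	in->refcount = refcount;
	in->orphan.prev = in->orphan.next = &in->orphan;
}

/* Put an inode on the orphan list when its last link is removed. */
void orphan_add(struct model_sb *sb, struct model_inode *in)
{
	in->orphan.next = sb->orphans.next;
	in->orphan.prev = &sb->orphans;
	sb->orphans.next->prev = &in->orphan;
	sb->orphans.next = &in->orphan;
}

/* Drop one reference; only the final drop takes the inode off the list. */
void inode_put(struct model_inode *in)
{
	if (--in->refcount > 0)
		return;
	in->orphan.prev->next = in->orphan.next;
	in->orphan.next->prev = in->orphan.prev;
	in->orphan.prev = in->orphan.next = &in->orphan;
}

/* At unmount the list must be empty; a non-empty list models the BUG. */
int umount_would_bug(const struct model_sb *sb)
{
	return sb->orphans.next != &sb->orphans;
}
```

In this model, an inode created with two references and put on the orphan
list is still there after a single inode_put(), so umount_would_bug()
reports a problem.  Only after the second inode_put() is the list empty
again.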

--D

* A very similar crash on ext2
  2014-10-09 20:49     ` Darrick J. Wong
@ 2014-10-09 21:28       ` Sami Liedes
  2014-10-21  0:28         ` Darrick J. Wong
  0 siblings, 1 reply; 15+ messages in thread
From: Sami Liedes @ 2014-10-09 21:28 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Thu, Oct 09, 2014 at 01:49:13PM -0700, Darrick J. Wong wrote:
> Yeah.  There's a directory that's linked twice (inode 195).  The subsequent FS
> walk loads the inode into memory twice (== i_count > 2).  When you delete
> everything on the FS, the inode gets put on the in-memory orphan list but for
> whatever reason doesn't seem to get released via iput or something.  This means
> it's still on the orphan list at umount time, which triggers the BUG.  Worse
> yet, i_nlink is now 0...
> 
> ...not clear what the appropriate course of action is here.  The FS is corrupt
> and we need to scrape the mess off the machine.  I guess you could -EIO earlier
> when you notice i_count > i_nlink?

I don't know if this is exactly the same bug, but I'm also seeing a
similar crash on ext2 which also bisected to this exact same commit
(908790fa3b). The symptoms are a bit different, though; first a VFS
warning about busy inodes after unmount, then shortly after that a
crash.

Pristine fs: http://www.niksula.hut.fi/~sliedes/ext2/testimg.ext2.bz2

Broken fs: http://www.niksula.hut.fi/~sliedes/ext2/testimg.ext2.449.min.bz2

Diff:

--- /dev/fd/63  2014-10-10 00:20:59.562913594 +0300
+++ /dev/fd/62  2014-10-10 00:20:59.562913594 +0300
@@ -9785,6 +9785,8 @@
 0080a8f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 80  |................|
 0080a900  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
 *
+0080ac20  ff ff ff ff ff ff ff ff  ff ff ff fd ff ff ff ff  |................|
+0080ac30  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
 0080ac40  ff ff 01 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
 0080ac50  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
 *

Backtrace:

[    1.422976] VFS: Busy inodes after unmount of vdb. Self-destruct in 5 seconds.  Have a nice day...
[    1.857020] BUG: unable to handle kernel NULL pointer dereference at 0000000000000197
[    1.858178] IP: [<ffffffff810a0859>] __lock_acquire.isra.31+0x199/0xd70
[    1.859047] PGD 633a067 PUD 5171067 PMD 0
[    1.859524] Oops: 0002 [#1] SMP
[    1.859842] CPU: 0 PID: 59 Comm: kworker/u2:1 Not tainted 3.16.0+ #94
[    1.860068] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[    1.860068] Workqueue: writeback bdi_writeback_workfn (flush-254:16)
[    1.860068] task: ffff8800060f2060 ti: ffff880006104000 task.ti: ffff880006104000
[    1.860068] RIP: 0010:[<ffffffff810a0859>]  [<ffffffff810a0859>] __lock_acquire.isra.31+0x199/0xd70
[    1.860068] RSP: 0018:ffff880006107b28  EFLAGS: 00010086
[    1.860068] RAX: 0000000000000000 RBX: ffff8800060f2060 RCX: 0000000000000001
[    1.860068] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8800051cb0c8
[    1.860068] RBP: ffff880006107b90 R08: 0000000000000000 R09: 0000000000000000
[    1.860068] R10: ffff8800051cb0c8 R11: 0000000000000003 R12: 0000000000000001
[    1.860068] R13: 0000000000000001 R14: ffffffffffffffff R15: 0000000000000000
[    1.860068] FS:  0000000000000000(0000) GS:ffff880007c00000(0000) knlGS:0000000000000000
[    1.860068] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[    1.860068] CR2: 0000000000000197 CR3: 000000000517c000 CR4: 00000000000006b0
[    1.860068] Stack:
[    1.860068]  ffff880006107b88 ffff8800060f2770 ffffffff81170027 0000000000000096
[    1.860068]  0000000000000000 0000000000000000 ffff8800060f2770 000000000000003d
[    1.860068]  0000000000000286 0000000000000000 0000000000000001 0000000000000001
[    1.860068] Call Trace:
[    1.860068]  [<ffffffff81170027>] ? SyS_sysfs+0xf7/0x1e0
[    1.860068]  [<ffffffff810a1c46>] lock_acquire+0x96/0x130
[    1.860068]  [<ffffffff81152aaf>] ? grab_super_passive+0x3f/0x90
[    1.860068]  [<ffffffff8109e079>] down_read_trylock+0x59/0x60
[    1.860068]  [<ffffffff81152aaf>] ? grab_super_passive+0x3f/0x90
[    1.860068]  [<ffffffff81152aaf>] grab_super_passive+0x3f/0x90
[    1.860068]  [<ffffffff8117c837>] __writeback_inodes_wb+0x57/0xd0
[    1.860068]  [<ffffffff8117caeb>] wb_writeback+0x23b/0x320
[    1.860068]  [<ffffffff8117ceed>] bdi_writeback_workfn+0x1cd/0x470
[    1.860068]  [<ffffffff8107bf90>] process_one_work+0x1c0/0x580
[    1.860068]  [<ffffffff8107bf27>] ? process_one_work+0x157/0x580
[    1.860068]  [<ffffffff8107c3b3>] worker_thread+0x63/0x540
[    1.860068]  [<ffffffff8107c350>] ? process_one_work+0x580/0x580
[    1.860068]  [<ffffffff81081b81>] kthread+0xf1/0x110
[    1.860068]  [<ffffffff81081a90>] ? __kthread_parkme+0x70/0x70
[    1.860068]  [<ffffffff81850f2c>] ret_from_fork+0x7c/0xb0
[    1.860068]  [<ffffffff81081a90>] ? __kthread_parkme+0x70/0x70
[    1.860068] Code: 0b 00 00 48 c7 c7 25 cd c8 81 31 c0 e8 31 4a fc ff eb a7 0f 1f 80 00 00 00 00 44 89 f8 4d 8b 74 c2 08 4d 85 f6 0f 84 c2 fe ff ff <3e> 41 ff 86 98 01 00 00 8b 05 f1 57 96 01 44 8b bb 90 06 00 00
[    1.860068] RIP  [<ffffffff810a0859>] __lock_acquire.isra.31+0x199/0xd70
[    1.860068]  RSP <ffff880006107b28>
[    1.860068] CR2: 0000000000000197
[    1.860068] ---[ end trace 3d3d835bcb59d5fe ]---
[    1.860068] Kernel panic - not syncing: Fatal exception
[    1.860068] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
[    1.860068] Rebooting in 1 seconds..

	Sami


* Re: A very similar crash on ext2
  2014-10-09 21:28       ` A very similar crash on ext2 Sami Liedes
@ 2014-10-21  0:28         ` Darrick J. Wong
  0 siblings, 0 replies; 15+ messages in thread
From: Darrick J. Wong @ 2014-10-21  0:28 UTC (permalink / raw)
  To: Sami Liedes, linux-ext4

On Fri, Oct 10, 2014 at 12:28:02AM +0300, Sami Liedes wrote:
> On Thu, Oct 09, 2014 at 01:49:13PM -0700, Darrick J. Wong wrote:
> > Yeah.  There's a directory that's linked twice (inode 195).  The subsequent FS
> > walk loads the inode into memory twice (== i_count > 2).  When you delete
> > everything on the FS, the inode gets put on the in-memory orphan list but for
> > whatever reason doesn't seem to get released via iput or something.  This means
> > it's still on the orphan list at umount time, which triggers the BUG.  Worse
> > yet, i_nlink is now 0...
> > 
> > ...not clear what the appropriate course of action is here.  The FS is corrupt
> > and we need to scrape the mess off the machine.  I guess you could -EIO earlier
> > when you notice i_count > i_nlink?
> 
> I don't know if this is exactly the same bug, but I'm also seeing a
> similar crash on ext2 which also bisected to this exact same commit
> (908790fa3b). The symptoms are a bit different, though; first a VFS
> warning about busy inodes after unmount, then shortly after that a
> crash.

ext4 spits up that crash message on umount because it thinks the orphan list is
messed up... but seems to avoid blowing up.

ext2 doesn't know what an orphan list is, so it goes straight to
the VFS warning and then blows up later, probably because it tries to do
something with the (now torn down) ext2 sb.

<shrug> I had a patch that would detect rmdir of multiply linked dirs, but I
think we ought to catch that sooner, if possible.
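
To illustrate the kind of check discussed above, here is a minimal userspace sketch, not kernel code. It assumes a toy inode structure and a hypothetical `parents` counter (the kernel does not track such a field directly); the names `toy_inode`, `toy_check_dir_links` and `toy_rmdir` are invented for this example. The idea is only that a directory reachable from more than one parent is rejected with -EIO before it can be unlinked and put on an orphan list.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for the few inode fields discussed above; NOT struct inode. */
struct toy_inode {
	unsigned long ino;
	bool is_dir;
	unsigned int nlink;	/* models the link count (i_nlink) */
	unsigned int parents;	/* hypothetical: number of parent directories
				 * holding an entry that points at this inode */
};

/* Return -EIO if a directory is linked from more than one parent. */
static int toy_check_dir_links(const struct toy_inode *inode)
{
	if (inode->is_dir && inode->parents > 1)
		return -EIO;
	return 0;
}

/*
 * Model of rmdir: refuse multiply linked directories up front, so they
 * never get i_nlink == 0 and never land on the (modelled) orphan list.
 */
static int toy_rmdir(struct toy_inode *inode, bool *on_orphan_list)
{
	int err = toy_check_dir_links(inode);

	if (err)
		return err;

	inode->nlink = 0;
	*on_orphan_list = true;
	return 0;
}
```

With a directory such as inode 195 from the earlier report modelled as `parents = 2`, `toy_rmdir()` returns -EIO and leaves both the link count and the orphan flag untouched; a normally linked directory (`parents = 1`) is removed as usual. Where such a check would best live in the real code (lookup, rmdir, or earlier) is exactly the open question above.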

--D

> Pristine fs: http://www.niksula.hut.fi/~sliedes/ext2/testimg.ext2.bz2
> 
> Broken fs: http://www.niksula.hut.fi/~sliedes/ext2/testimg.ext2.449.min.bz2
> 
> Diff:
> 
> --- /dev/fd/63  2014-10-10 00:20:59.562913594 +0300
> +++ /dev/fd/62  2014-10-10 00:20:59.562913594 +0300
> @@ -9785,6 +9785,8 @@
>  0080a8f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 80  |................|
>  0080a900  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
>  *
> +0080ac20  ff ff ff ff ff ff ff ff  ff ff ff fd ff ff ff ff  |................|
> +0080ac30  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
>  0080ac40  ff ff 01 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
>  0080ac50  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
>  *
> 
> Backtrace:
> 
> [    1.422976] VFS: Busy inodes after unmount of vdb. Self-destruct in 5 seconds.  Have a nice day...
> [    1.857020] BUG: unable to handle kernel NULL pointer dereference at 0000000000000197
> [    1.858178] IP: [<ffffffff810a0859>] __lock_acquire.isra.31+0x199/0xd70
> [    1.859047] PGD 633a067 PUD 5171067 PMD 0
> [    1.859524] Oops: 0002 [#1] SMP
> [    1.859842] CPU: 0 PID: 59 Comm: kworker/u2:1 Not tainted 3.16.0+ #94
> [    1.860068] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
> [    1.860068] Workqueue: writeback bdi_writeback_workfn (flush-254:16)
> [    1.860068] task: ffff8800060f2060 ti: ffff880006104000 task.ti: ffff880006104000
> [    1.860068] RIP: 0010:[<ffffffff810a0859>]  [<ffffffff810a0859>] __lock_acquire.isra.31+0x199/0xd70
> [    1.860068] RSP: 0018:ffff880006107b28  EFLAGS: 00010086
> [    1.860068] RAX: 0000000000000000 RBX: ffff8800060f2060 RCX: 0000000000000001
> [    1.860068] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8800051cb0c8
> [    1.860068] RBP: ffff880006107b90 R08: 0000000000000000 R09: 0000000000000000
> [    1.860068] R10: ffff8800051cb0c8 R11: 0000000000000003 R12: 0000000000000001
> [    1.860068] R13: 0000000000000001 R14: ffffffffffffffff R15: 0000000000000000
> [    1.860068] FS:  0000000000000000(0000) GS:ffff880007c00000(0000) knlGS:0000000000000000
> [    1.860068] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [    1.860068] CR2: 0000000000000197 CR3: 000000000517c000 CR4: 00000000000006b0
> [    1.860068] Stack:
> [    1.860068]  ffff880006107b88 ffff8800060f2770 ffffffff81170027 0000000000000096
> [    1.860068]  0000000000000000 0000000000000000 ffff8800060f2770 000000000000003d
> [    1.860068]  0000000000000286 0000000000000000 0000000000000001 0000000000000001
> [    1.860068] Call Trace:
> [    1.860068]  [<ffffffff81170027>] ? SyS_sysfs+0xf7/0x1e0
> [    1.860068]  [<ffffffff810a1c46>] lock_acquire+0x96/0x130
> [    1.860068]  [<ffffffff81152aaf>] ? grab_super_passive+0x3f/0x90
> [    1.860068]  [<ffffffff8109e079>] down_read_trylock+0x59/0x60
> [    1.860068]  [<ffffffff81152aaf>] ? grab_super_passive+0x3f/0x90
> [    1.860068]  [<ffffffff81152aaf>] grab_super_passive+0x3f/0x90
> [    1.860068]  [<ffffffff8117c837>] __writeback_inodes_wb+0x57/0xd0
> [    1.860068]  [<ffffffff8117caeb>] wb_writeback+0x23b/0x320
> [    1.860068]  [<ffffffff8117ceed>] bdi_writeback_workfn+0x1cd/0x470
> [    1.860068]  [<ffffffff8107bf90>] process_one_work+0x1c0/0x580
> [    1.860068]  [<ffffffff8107bf27>] ? process_one_work+0x157/0x580
> [    1.860068]  [<ffffffff8107c3b3>] worker_thread+0x63/0x540
> [    1.860068]  [<ffffffff8107c350>] ? process_one_work+0x580/0x580
> [    1.860068]  [<ffffffff81081b81>] kthread+0xf1/0x110
> [    1.860068]  [<ffffffff81081a90>] ? __kthread_parkme+0x70/0x70
> [    1.860068]  [<ffffffff81850f2c>] ret_from_fork+0x7c/0xb0
> [    1.860068]  [<ffffffff81081a90>] ? __kthread_parkme+0x70/0x70
> [    1.860068] Code: 0b 00 00 48 c7 c7 25 cd c8 81 31 c0 e8 31 4a fc ff eb a7 0f 1f 80 00 00 00 00 44 89 f8 4d 8b 74 c2 08 4d 85 f6 0f 84 c2 fe ff ff <3e> 41 ff 86 98 01 00 00 8b 05 f1 57 96 01 44 8b bb 90 06 00 00
> [    1.860068] RIP  [<ffffffff810a0859>] __lock_acquire.isra.31+0x199/0xd70
> [    1.860068]  RSP <ffff880006107b28>
> [    1.860068] CR2: 0000000000000197
> [    1.860068] ---[ end trace 3d3d835bcb59d5fe ]---
> [    1.860068] Kernel panic - not syncing: Fatal exception
> [    1.860068] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
> [    1.860068] Rebooting in 1 seconds..
> 
> 	Sami
> 
> 




end of thread, other threads:[~2014-10-21  0:28 UTC | newest]

Thread overview: 15+ messages
2014-10-05  0:12 Intentionally corrupted ext4s causing two different kernel panics at umount Sami Liedes
2014-10-06  2:48 ` [PATCH 1/2] ext4: don't orphan or truncate the boot loader inode Theodore Ts'o
2014-10-06  2:48   ` [PATCH 2/2] ext4: add ext4_iget_normal() which is to be used for dir tree lookups Theodore Ts'o
2014-10-06  2:52     ` Andreas Dilger
2014-10-06  3:16       ` Theodore Ts'o
2014-10-06 15:09     ` Jan Kara
2014-10-06 18:55       ` Theodore Ts'o
2014-10-06 15:06   ` [PATCH 1/2] ext4: don't orphan or truncate the boot loader inode Jan Kara
2014-10-07 20:56 ` One more corrupted fs crash in ext4_put_super Sami Liedes
2014-10-07 21:57   ` Darrick J. Wong
2014-10-07 22:22     ` Darrick J. Wong
2014-10-09 20:15   ` Sami Liedes
2014-10-09 20:49     ` Darrick J. Wong
2014-10-09 21:28       ` A very similar crash on ext2 Sami Liedes
2014-10-21  0:28         ` Darrick J. Wong
