linux-ext4.vger.kernel.org archive mirror
* Null pointer dereference of s_chksum_driver  in ext4_chksum
@ 2015-05-01 19:48 Nikhilesh Reddy
  2015-05-03 20:11 ` Nikhilesh Reddy
  0 siblings, 1 reply; 4+ messages in thread
From: Nikhilesh Reddy @ 2015-05-01 19:48 UTC (permalink / raw)
  To: linux-ext4

Hi

I am running the 3.10 (Android) kernel.

I have run into a couple of instances of a null pointer dereference 
occurring in the function ext4_chksum.



This issue seems to be the same one as
https://bugzilla.kernel.org/show_bug.cgi?id=82201

I am not sure if this was ever solved?

Can someone kindly point me in the right direction?

The only patch I found that might be remotely related is:
https://www.codeaurora.org/cgit/quic/la/kernel/msm-3.14/commit/?h=LA.HB.1.1.1_rb1.10&id=9cf666834cffdb450b9b18f3e06c30493cb40ed2

I am not entirely sure if this is the fix for the issue.

Please find additional details below:

This occurred while dereferencing the sbi->s_chksum_driver member of
the superblock info.

This occurs during a bootup mount

     10.216919:   <6> EXT4-fs (mmcblk0p22): mounted filesystem with ordered data mode. Opts: barrier=1,discard
     10.225032:   <6> SELinux: initialized (dev mmcblk0p22, type ext4), uses xattr
     10.235901:   <6> EXT4-fs (mmcblk0p29): Ignoring removed nomblk_io_submit option
     10.341141:   <6> Unable to handle kernel NULL pointer dereference at virtual address 00000000

The call stack is as below:

                   [<ffffffc000393a54>] ext4_superblock_csum+0x20/0x68
     10.498103:   <2> [<ffffffc000393fc8>] ext4_superblock_csum_set+0x20/0x34
     10.504353:   <2> [<ffffffc00039455c>] ext4_commit_super+0x178/0x1f4
     10.510170:   <2> [<ffffffc0003945f4>] save_error_info+0x1c/0x2c
     10.515638:   <2> [<ffffffc000394954>] ext4_error_inode+0x4c/0x13c
     10.521282:   <2> [<ffffffc00037d510>] ext4_map_blocks+0x354/0x398
     10.526924:   <2> [<ffffffc00037e97c>] _ext4_get_block+0xc0/0x160
     10.532479:   <2> [<ffffffc00037ea2c>] ext4_get_block+0x10/0x1c
     10.537863:   <2> [<ffffffc00031e808>] generic_block_bmap+0x34/0x44
     10.543589:   <2> [<ffffffc00037b980>] ext4_bmap+0x78/0xd4
     10.548539:   <2> [<ffffffc00030a2ec>] bmap+0x20/0x2c
     10.553052:   <2> [<ffffffc0003c8ec0>] jbd2_journal_bmap+0x24/0x9c
     10.558695:   <2> [<ffffffc0003c311c>] jread+0x54/0x228
     10.563381:   <2> [<ffffffc0003c3618>] do_one_pass+0x328/0x724
     10.568678:   <2> [<ffffffc0003c3a8c>] jbd2_journal_recover+0x78/0xdc
     10.574580:   <2> [<ffffffc0003c8c80>] jbd2_journal_load+0x154/0x308
     10.580396:   <2> [<ffffffc000398168>] ext4_fill_super+0x1984/0x2470
     10.586211:   <2> [<ffffffc0002f8634>] mount_bdev+0x134/0x1b8
     10.591420:   <2> [<ffffffc000392f18>] ext4_mount+0x10/0x1c
     10.596454:   <2> [<ffffffc0002f8ebc>] mount_fs+0x78/0x174
     10.601404:   <2> [<ffffffc00030f420>] vfs_kern_mount+0x58/0xcc
     10.606785:   <2> [<ffffffc000311748>] do_mount+0x6f0/0x7d4
     10.611819:   <2> [<ffffffc0003118b8>] SyS_mount+0x8c/0xd0
     10.616768:   <6> Code: 9100fff3 f9420000 927ae673 f942340(b9400002)
     10.622935:   <6> ---[ end trace 69fa2927148e4ec2 ]---
     10.627528:   <6> Kernel panic - not syncing: Fatal exception

-- 
Thanks
Nikhilesh Reddy

Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.


* Null pointer dereference of s_chksum_driver  in ext4_chksum
  2015-05-01 19:48 Null pointer dereference of s_chksum_driver in ext4_chksum Nikhilesh Reddy
@ 2015-05-03 20:11 ` Nikhilesh Reddy
  2015-05-03 21:37   ` Theodore Ts'o
  0 siblings, 1 reply; 4+ messages in thread
From: Nikhilesh Reddy @ 2015-05-03 20:11 UTC (permalink / raw)
  To: linux-ext4, tytso, darrick.wong

Hi

I am running the 3.10 (Android) kernel.

I have run into a couple of instances of a null pointer dereference
occurring in the function ext4_chksum.



This issue seems to be the same one as
https://bugzilla.kernel.org/show_bug.cgi?id=82201

I am not sure if this was ever solved?

Can someone kindly point me in the right direction?

The only patch I found that might be remotely related is:
https://www.codeaurora.org/cgit/quic/la/kernel/msm-3.14/commit/?h=LA.HB.1.1.1_rb1.10&id=9cf666834cffdb450b9b18f3e06c30493cb40ed2

I am not entirely sure if this is the fix for the issue.

Please find additional details below:

This occurred while dereferencing the sbi->s_chksum_driver member of
the superblock info.

This occurs during a bootup mount

     10.216919:   <6> EXT4-fs (mmcblk0p22): mounted filesystem with ordered data mode. Opts: barrier=1,discard
     10.225032:   <6> SELinux: initialized (dev mmcblk0p22, type ext4), uses xattr
     10.235901:   <6> EXT4-fs (mmcblk0p29): Ignoring removed nomblk_io_submit option
     10.341141:   <6> Unable to handle kernel NULL pointer dereference at virtual address 00000000

The call stack is as below:

                   [<ffffffc000393a54>] ext4_superblock_csum+0x20/0x68
     10.498103:   <2> [<ffffffc000393fc8>] ext4_superblock_csum_set+0x20/0x34
     10.504353:   <2> [<ffffffc00039455c>] ext4_commit_super+0x178/0x1f4
     10.510170:   <2> [<ffffffc0003945f4>] save_error_info+0x1c/0x2c
     10.515638:   <2> [<ffffffc000394954>] ext4_error_inode+0x4c/0x13c
     10.521282:   <2> [<ffffffc00037d510>] ext4_map_blocks+0x354/0x398
     10.526924:   <2> [<ffffffc00037e97c>] _ext4_get_block+0xc0/0x160
     10.532479:   <2> [<ffffffc00037ea2c>] ext4_get_block+0x10/0x1c
     10.537863:   <2> [<ffffffc00031e808>] generic_block_bmap+0x34/0x44
     10.543589:   <2> [<ffffffc00037b980>] ext4_bmap+0x78/0xd4
     10.548539:   <2> [<ffffffc00030a2ec>] bmap+0x20/0x2c
     10.553052:   <2> [<ffffffc0003c8ec0>] jbd2_journal_bmap+0x24/0x9c
     10.558695:   <2> [<ffffffc0003c311c>] jread+0x54/0x228
     10.563381:   <2> [<ffffffc0003c3618>] do_one_pass+0x328/0x724
     10.568678:   <2> [<ffffffc0003c3a8c>] jbd2_journal_recover+0x78/0xdc
     10.574580:   <2> [<ffffffc0003c8c80>] jbd2_journal_load+0x154/0x308
     10.580396:   <2> [<ffffffc000398168>] ext4_fill_super+0x1984/0x2470
     10.586211:   <2> [<ffffffc0002f8634>] mount_bdev+0x134/0x1b8
     10.591420:   <2> [<ffffffc000392f18>] ext4_mount+0x10/0x1c
     10.596454:   <2> [<ffffffc0002f8ebc>] mount_fs+0x78/0x174
     10.601404:   <2> [<ffffffc00030f420>] vfs_kern_mount+0x58/0xcc
     10.606785:   <2> [<ffffffc000311748>] do_mount+0x6f0/0x7d4
     10.611819:   <2> [<ffffffc0003118b8>] SyS_mount+0x8c/0xd0
     10.616768:   <6> Code: 9100fff3 f9420000 927ae673 f942340(b9400002)
     10.622935:   <6> ---[ end trace 69fa2927148e4ec2 ]---
     10.627528:   <6> Kernel panic - not syncing: Fatal exception


-- 
Thanks
Nikhilesh Reddy

Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.




* Re: Null pointer dereference of s_chksum_driver  in ext4_chksum
  2015-05-03 20:11 ` Nikhilesh Reddy
@ 2015-05-03 21:37   ` Theodore Ts'o
  2015-05-04 19:35     ` Nikhilesh Reddy
  0 siblings, 1 reply; 4+ messages in thread
From: Theodore Ts'o @ 2015-05-03 21:37 UTC (permalink / raw)
  To: Nikhilesh Reddy; +Cc: linux-ext4, darrick.wong

On Sun, May 03, 2015 at 01:11:58PM -0700, Nikhilesh Reddy wrote:
> Please find additional details below:
> 
> This occurred while dereferencing the sbi->s_chksum_driver member of
> the superblock info.
> 
> This occurs during a bootup mount
> 
>     10.216919:   <6> EXT4-fs (mmcblk0p22): mounted filesystem with ordered data mode. Opts: barrier=1,discard
>     10.225032:   <6> SELinux: initialized (dev mmcblk0p22, type ext4), uses xattr
>     10.235901:   <6> EXT4-fs (mmcblk0p29): Ignoring removed nomblk_io_submit option
>     10.341141:   <6> Unable to handle kernel NULL pointer dereference at virtual address 00000000

I'd have to actually see the full file system to understand what is
going on, but what I suspect is happening is that the file system has
been corrupted in at least two different ways.  The first is that
the journal inode is corrupted; this is what's causing the call
to ext4_error_inode() from a call to jbd2_journal_bmap().

The *second* thing which is going on is that before we noticed the
corrupted journal inode, the journal contained a copy of the
superblock which we replayed that _set_ the metadata checksum feature
flag.  Since it wasn't set originally when the file system was initially
mounted, s_chksum_driver wasn't initialized, and this causes the NULL
pointer dereference.

Avoiding the kernel crash was fixed by accident in 3.18 with the
following commit: 9aa5d32ba269 ("Replace open coded mdata csum feature
to helper function"), since instead of actually checking to see if the
metadata checksum field is set, it uses as its primary mechanism
checking to see if s_chksum_driver is non-NULL.  There is a
WARN_ON_ONCE that will trip in the situation where the feature flag is
set and s_chksum_driver is NULL, but that really is a "should never
happen" situation.  The only scenario I can think of where this might
have happened is the one I described above, where it was enabled by a
journal replay.
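
The difference between the two checks can be sketched in userspace C.
This is only an illustration under simplified assumptions: struct
sb_info, struct csum_driver, FEATURE_METADATA_CSUM and the function
names below are hypothetical stand-ins, not the kernel's definitions,
and the warning is a plain fprintf() rather than WARN_ON_ONCE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the metadata checksum feature bit. */
#define FEATURE_METADATA_CSUM 0x0400u

/* Hypothetical stand-in for the checksum driver handle. */
struct csum_driver {
	const char *name;
};

/* Hypothetical, heavily reduced model of the superblock info. */
struct sb_info {
	unsigned int ro_compat;                  /* feature flags as read from disk */
	const struct csum_driver *chksum_driver; /* set at mount only if the flag was set */
};

/* Models a journal replay that writes back a superblock copy with the flag set. */
static void replay_superblock_with_csum_flag(struct sb_info *sbi)
{
	sbi->ro_compat |= FEATURE_METADATA_CSUM;
	/* chksum_driver is left untouched: nothing initializes it here. */
}

/* Older style: trust the feature flag alone. */
static bool csum_enabled_by_flag(const struct sb_info *sbi)
{
	return (sbi->ro_compat & FEATURE_METADATA_CSUM) != 0;
}

/* Newer style: the driver pointer decides; a flag/driver mismatch only warns. */
static bool csum_enabled_by_driver(const struct sb_info *sbi)
{
	if (csum_enabled_by_flag(sbi) && sbi->chksum_driver == NULL)
		fprintf(stderr, "warning: csum flag set but no checksum driver\n");
	return sbi->chksum_driver != NULL;
}
```

With a struct sb_info that starts with no flag and a NULL driver, calling
replay_superblock_with_csum_flag() makes csum_enabled_by_flag() return
true, so a caller would go on to use the NULL driver, while
csum_enabled_by_driver() returns false and the checksum path is skipped.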

This should be sufficient to avoid the crash, but I haven't had the
chance to try creating a file system corrupted the way I conjecture it
was corrupted, and see whether we correctly fail the mount (which
is clearly what should happen if we discover a corrupted journal inode
while replaying the journal during the mount.)

      		    	    	       - Ted


* Re: Null pointer dereference of s_chksum_driver  in ext4_chksum
  2015-05-03 21:37   ` Theodore Ts'o
@ 2015-05-04 19:35     ` Nikhilesh Reddy
  0 siblings, 0 replies; 4+ messages in thread
From: Nikhilesh Reddy @ 2015-05-04 19:35 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: linux-ext4, darrick.wong

Thank you so much Ted.
Will try to identify the root cause of the journal inode corruption.

And thanks for pointing me to commit 9aa5d32ba269.

--
Thanks
Nikhilesh Reddy

Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora 
Forum,
a Linux Foundation Collaborative Project.


