linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] Misc ext4 fixes
@ 2013-06-03 10:00 Ruslan Bilovol
  2013-06-03 10:00 ` [PATCH 1/2] jbd2: check bh->b_data for NULL in jbd2_journal_get_descriptor_buffer before memset() Ruslan Bilovol
  2013-06-03 10:00 ` [PATCH 2/2] ext4: add sanity checks in __ext4_check_dir_entry Ruslan Bilovol
  0 siblings, 2 replies; 9+ messages in thread
From: Ruslan Bilovol @ 2013-06-03 10:00 UTC (permalink / raw)
  To: tytso, adilger.kernel; +Cc: linux-ext4, linux-kernel

Hello guys,

Here are few patches that add sanity checks
before dereferencing some pointers inside of
ext4/jbd2 functions and help to avoid some crashes
on systems under high load and limited resources.
I met them in next usecase: writing a high-bitrate
video to ext4 partition from HD webcam. The issues
appear very rare.
I do not know if these issues are already fixed in some
maintainers repo, but they are still applicable
to 3.10-rc4 tag on which they are based on.
I understand that just sanity checking may not be
enough and the root cause may be in some other
place, so I hope the patches will be reviewed
by ext4 guru as well.

Regards,
Ruslan

Ruslan Bilovol (2):
  jbd2: check bh->b_data for NULL in jbd2_journal_get_descriptor_buffer
    before memset()
  ext4: add sanity checks in __ext4_check_dir_entry

 fs/ext4/dir.c     |    8 ++++++--
 fs/jbd2/journal.c |    2 +-
 2 files changed, 7 insertions(+), 3 deletions(-)

-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/2] jbd2: check bh->b_data for NULL in jbd2_journal_get_descriptor_buffer before memset()
  2013-06-03 10:00 [PATCH 0/2] Misc ext4 fixes Ruslan Bilovol
@ 2013-06-03 10:00 ` Ruslan Bilovol
  2013-06-03 15:33   ` Theodore Ts'o
  2013-06-03 10:00 ` [PATCH 2/2] ext4: add sanity checks in __ext4_check_dir_entry Ruslan Bilovol
  1 sibling, 1 reply; 9+ messages in thread
From: Ruslan Bilovol @ 2013-06-03 10:00 UTC (permalink / raw)
  To: tytso, adilger.kernel; +Cc: linux-ext4, linux-kernel

The memset() doesn't perform any NULL-pointer checking
before dereferencing passed pointer so this should be
checked before calling it.

This fixes next issue:

[38200.069122] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[38200.078002] pgd = c0004000
[38200.080963] [00000000] *pgd=00000000
[38200.084991] Internal error: Oops: 805 [#1] PREEMPT SMP ARM
[38200.091003] Modules linked in: rproc_drm(O) tf_driver(O) gps_drv wl18xx(O) wl12xx(O) wlcore(O) mac80211(O) cfg80211(O) pvrsrvkm_sgx540_120(O) compat(O)
[38200.106719] CPU: 1    Tainted: G        W  O  (3.4.34 #1)
[38200.112579] PC is at __memzero+0x24/0x80
[38200.116882] LR is at 0x0
[38200.119689] pc : [<c023b004>]    lr : [<00000000>]    psr: 28000113
[38200.119689] sp : d66b1e2c  ip : 00000000  fp : d66b1e54
[38200.132171] r10: 00000000  r9 : d6ad48c0  r8 : c01bd414
[38200.137847] r7 : 00000000  r6 : ffffffff  r5 : cb19fe48  r4 : d678bc00
[38200.144958] r3 : 00000000  r2 : 00000000  r1 : 00000fc0  r0 : 00000000
[38200.152008] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
[38200.160034] Control: 10c5387d  Table: 967b004a  DAC: 00000015
[...]
[38200.888031] Backtrace:
[38200.890869] [<c01c509c>] (jbd2_journal_get_descriptor_buffer+0x0/0xa4) from [<c01bddf0>] (jbd2_journal_commit_transaction+0x994/0x18f4)
[38200.903930]  r5:d6ad5348 r4:d678bc00
[38200.907989] [<c01bd45c>] (jbd2_journal_commit_transaction+0x0/0x18f4) from [<c01c2e70>] (kjournald2+0xb4/0x24c)
[38200.918884] [<c01c2dbc>] (kjournald2+0x0/0x24c) from [<c0066990>] (kthread+0x90/0x9c)
[38200.927429] [<c0066900>] (kthread+0x0/0x9c) from [<c004a968>] (do_exit+0x0/0x804)
[38200.935577]  r6:c004a968 r5:c0066900 r4:d6749c8c
[38200.940887] Code: e52de004 e1a0c002 e1a0e002 e2511040 (a8a0500c)

Signed-off-by: Ruslan Bilovol <ruslan.bilovol@ti.com>
---
 fs/jbd2/journal.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index 9545757..48f3da5 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -810,7 +810,7 @@ struct journal_head *jbd2_journal_get_descriptor_buffer(journal_t *journal)
 		return NULL;
 
 	bh = __getblk(journal->j_dev, blocknr, journal->j_blocksize);
-	if (!bh)
+	if (!bh || !bh->b_data)
 		return NULL;
 	lock_buffer(bh);
 	memset(bh->b_data, 0, journal->j_blocksize);
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/2] ext4: add sanity checks in __ext4_check_dir_entry
  2013-06-03 10:00 [PATCH 0/2] Misc ext4 fixes Ruslan Bilovol
  2013-06-03 10:00 ` [PATCH 1/2] jbd2: check bh->b_data for NULL in jbd2_journal_get_descriptor_buffer before memset() Ruslan Bilovol
@ 2013-06-03 10:00 ` Ruslan Bilovol
  2013-06-03 15:40   ` Theodore Ts'o
  1 sibling, 1 reply; 9+ messages in thread
From: Ruslan Bilovol @ 2013-06-03 10:00 UTC (permalink / raw)
  To: tytso, adilger.kernel; +Cc: linux-ext4, linux-kernel

Added checks for NULL before dereferencing some pointers

This fixes next issue:

[ 1531.530609] Unable to handle kernel NULL pointer dereference at virtual address 00000004
[ 1531.543151] pgd = d6068000
[ 1531.546112] [00000004] *pgd=00000000
[ 1531.550109] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
[ 1531.555816] Modules linked in: rproc_drm(O) tf_driver(O) gps_drv wl18xx(O) wl12xx(O) wlcore(O) mac80211(O) cfg80211(O) pvrsrvkm_sgx540_120(O) compat(O)
[ 1531.571105] CPU: 0    Tainted: G        W  O  (3.4.34 #1)
[ 1531.576934] PC is at __ext4_check_dir_entry+0x24/0x1a4
[ 1531.582550] LR is at ext4_readdir+0x270/0x7fc
[ 1531.587249] pc : [<c017a788>]    lr : [<c017ab78>]    psr: 80000113
[ 1531.587249] sp : d6051e50  ip : 00000000  fp : d6051eb4
[ 1531.599700] r10: d4f0fdb0  r9 : c06a5c54  r8 : 000000d5
[ 1531.605377] r7 : d4c54540  r6 : d4634948  r5 : 00000000  r4 : 00000000
[ 1531.612396] r3 : d4634948  r2 : d4f0fdb0  r1 : 000000d5  r0 : c06a5c54
[ 1531.619476] Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[ 1531.627166] Control: 10c5387d  Table: 9606804a  DAC: 00000015
[...]
[ 1532.654876] Backtrace:
[ 1532.657653] [<c017a764>] (__ext4_check_dir_entry+0x0/0x1a4) from [<c017ab78>] (ext4_readdir+0x270/0x7fc)
[ 1532.667968] [<c017a908>] (ext4_readdir+0x0/0x7fc) from [<c0124860>] (vfs_readdir+0x9c/0xc0)
[ 1532.677062] [<c01247c4>] (vfs_readdir+0x0/0xc0) from [<c0124a0c>] (sys_getdents64+0x68/0xc0)
[ 1532.686248] [<c01249a4>] (sys_getdents64+0x0/0xc0) from [<c0013680>] (ret_fast_syscall+0x0/0x30)
[ 1532.695800]  r7:000000d9 r6:00000000 r5:416d4168 r4:416d4158
[ 1532.702270] Code: e1a07003 e1a09000 e1a08001 e59b3008 (e1dc40b4)

Signed-off-by: Ruslan Bilovol <ruslan.bilovol@ti.com>
---
 fs/ext4/dir.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c
index f8d56e4..cf0875b 100644
--- a/fs/ext4/dir.c
+++ b/fs/ext4/dir.c
@@ -68,8 +68,12 @@ int __ext4_check_dir_entry(const char *function, unsigned int line,
 			   unsigned int offset)
 {
 	const char *error_msg = NULL;
-	const int rlen = ext4_rec_len_from_disk(de->rec_len,
-						dir->i_sb->s_blocksize);
+	int rlen;
+
+	if (!de || !bh || !dir)
+		return 1;
+
+	rlen = ext4_rec_len_from_disk(de->rec_len, dir->i_sb->s_blocksize);
 
 	if (unlikely(rlen < EXT4_DIR_REC_LEN(1)))
 		error_msg = "rec_len is smaller than minimal";
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] jbd2: check bh->b_data for NULL in jbd2_journal_get_descriptor_buffer before memset()
  2013-06-03 10:00 ` [PATCH 1/2] jbd2: check bh->b_data for NULL in jbd2_journal_get_descriptor_buffer before memset() Ruslan Bilovol
@ 2013-06-03 15:33   ` Theodore Ts'o
  2013-06-04 11:15     ` Ruslan Bilovol
  0 siblings, 1 reply; 9+ messages in thread
From: Theodore Ts'o @ 2013-06-03 15:33 UTC (permalink / raw)
  To: Ruslan Bilovol; +Cc: adilger.kernel, linux-ext4, linux-kernel

On Mon, Jun 03, 2013 at 01:00:15PM +0300, Ruslan Bilovol wrote:
> The memset() doesn't perform any NULL-pointer checking
> before dereferencing passed pointer so this should be
> checked before calling it.

I can see that __getblk() can return NULL if there is a memory
allocation failure (and is defined to do so), so checking to make sure
bh is not NULL is a good thing to do.

Have you actually seen a case where bh is non-NULL, but bh->b_data is
NULL?  If not, it might be better to do something like this:

>  	bh = __getblk(journal->j_dev, blocknr, journal->j_blocksize);
	if (!bh)
		return NULL;
	BUG_ON(!bh->b_data);

						- Ted

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] ext4: add sanity checks in __ext4_check_dir_entry
  2013-06-03 10:00 ` [PATCH 2/2] ext4: add sanity checks in __ext4_check_dir_entry Ruslan Bilovol
@ 2013-06-03 15:40   ` Theodore Ts'o
  2013-06-04 11:17     ` Ruslan Bilovol
  0 siblings, 1 reply; 9+ messages in thread
From: Theodore Ts'o @ 2013-06-03 15:40 UTC (permalink / raw)
  To: Ruslan Bilovol; +Cc: adilger.kernel, linux-ext4, linux-kernel

On Mon, Jun 03, 2013 at 01:00:16PM +0300, Ruslan Bilovol wrote:
> Added checks for NULL before dereferencing some pointers
> 
> +	if (!de || !bh || !dir)
> +		return 1;

Do you know which one of these pointers was NULL?

I want't make sure we're fixing the root issue as opposed to papering
over it.  I'm not seeing how any of these pointers could have been
NULL (at least in my upstream kernel), and while I believe that this
helped for your kernel, I want to make sure we understand exactly what
was going on.

Thanks,

					- Ted

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] jbd2: check bh->b_data for NULL in jbd2_journal_get_descriptor_buffer before memset()
  2013-06-03 15:33   ` Theodore Ts'o
@ 2013-06-04 11:15     ` Ruslan Bilovol
  2013-06-04 13:37       ` Theodore Ts'o
  0 siblings, 1 reply; 9+ messages in thread
From: Ruslan Bilovol @ 2013-06-04 11:15 UTC (permalink / raw)
  To: Theodore Ts'o, Ruslan Bilovol, adilger.kernel, linux-ext4,
	linux-kernel

Hi Ted,

On Mon, Jun 3, 2013 at 6:33 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> On Mon, Jun 03, 2013 at 01:00:15PM +0300, Ruslan Bilovol wrote:
>> The memset() doesn't perform any NULL-pointer checking
>> before dereferencing passed pointer so this should be
>> checked before calling it.
>
> I can see that __getblk() can return NULL if there is a memory
> allocation failure (and is defined to do so), so checking to make sure
> bh is not NULL is a good thing to do.
>
> Have you actually seen a case where bh is non-NULL, but bh->b_data is
> NULL?  If not, it might be better to do something like this:

Yes, this is exactly the situation I observe (bh is non-NULL, but
bh->b_data is NULL)

>
>>       bh = __getblk(journal->j_dev, blocknr, journal->j_blocksize);
>         if (!bh)
>                 return NULL;
>         BUG_ON(!bh->b_data);

Is it so critical that we need to stop the kernel here?
Can we recover from this state gracefully?
Maybe something like this may be better:

       bh = __getblk(journal->j_dev, blocknr, journal->j_blocksize);
         if (!bh)
                 return NULL;
         if(!bh->b_data) {
                 WARN_ON(1);
                 return NULL;
         }

Regards,
Ruslan

>
>                                                 - Ted
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/



-- 
Best regards,
Ruslan Bilvol

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] ext4: add sanity checks in __ext4_check_dir_entry
  2013-06-03 15:40   ` Theodore Ts'o
@ 2013-06-04 11:17     ` Ruslan Bilovol
  0 siblings, 0 replies; 9+ messages in thread
From: Ruslan Bilovol @ 2013-06-04 11:17 UTC (permalink / raw)
  To: Theodore Ts'o, Ruslan Bilovol, adilger.kernel, linux-ext4,
	linux-kernel

Hi Ted,

On Mon, Jun 3, 2013 at 6:40 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> On Mon, Jun 03, 2013 at 01:00:16PM +0300, Ruslan Bilovol wrote:
>> Added checks for NULL before dereferencing some pointers
>>
>> +     if (!de || !bh || !dir)
>> +             return 1;
>
> Do you know which one of these pointers was NULL?

In my particular case, the ''de" pointer is NULL

Regards,
Ruslan

>
> I want't make sure we're fixing the root issue as opposed to papering
> over it.  I'm not seeing how any of these pointers could have been
> NULL (at least in my upstream kernel), and while I believe that this
> helped for your kernel, I want to make sure we understand exactly what
> was going on.
>
> Thanks,
>
>                                         - Ted
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/



-- 
Best regards,
Ruslan Bilvol

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] jbd2: check bh->b_data for NULL in jbd2_journal_get_descriptor_buffer before memset()
  2013-06-04 11:15     ` Ruslan Bilovol
@ 2013-06-04 13:37       ` Theodore Ts'o
  2013-06-06  8:02         ` Ruslan Bilovol
  0 siblings, 1 reply; 9+ messages in thread
From: Theodore Ts'o @ 2013-06-04 13:37 UTC (permalink / raw)
  To: Ruslan Bilovol; +Cc: adilger.kernel, linux-ext4, linux-kernel

On Tue, Jun 04, 2013 at 02:15:57PM +0300, Ruslan Bilovol wrote:
> > Have you actually seen a case where bh is non-NULL, but bh->b_data is
> > NULL?  If not, it might be better to do something like this:
> 
> Yes, this is exactly the situation I observe (bh is non-NULL, but
> bh->b_data is NULL)

Hmm... so the stack trace you sent in the commit description was one
where bh->b_data was NULL?  I'm trying to make sure there isn't
something else going on that we don't understand.

Could you put some instrumentation in __find_get_block()?  Something like this:

struct buffer_head *
__find_get_block(struct block_device *bdev, sector_t block, unsigned size)
{
	struct buffer_head *bh = lookup_bh_lru(bdev, block, size);

	if (bh == NULL) {
		bh = __find_get_block_slow(bdev, block);
		if (bh->b_data == NULL) {
		   pr_crit("b_data NULL after find_get_block_slow\n);
		   WARN_ON(1);
		}
		if (bh)
			bh_lru_install(bh);
	} else {
		if (bh->b_data == NULL) {
			pr_crit("b_data NULL after lookup_bh_lru\n");
			WARN_ON(1);
		}
	}
	if (bh)
		touch_buffer(bh);
	return bh;
}

... and then send me the stack trace after running your reproduction
case.  If it turns out the problem is in __find_get_block_slow(),
could you put in similar debugging checks there and try to track it
down?

I'm pretty sure the case of bh non-NULL and bh->b_data NULL is never
supposed to happen, and while we could just put a check where you
suggested, there are plenty of other places which use __getblk(), and
there may be other bugs that are hiding here.

Regards,

						- Ted

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] jbd2: check bh->b_data for NULL in jbd2_journal_get_descriptor_buffer before memset()
  2013-06-04 13:37       ` Theodore Ts'o
@ 2013-06-06  8:02         ` Ruslan Bilovol
  0 siblings, 0 replies; 9+ messages in thread
From: Ruslan Bilovol @ 2013-06-06  8:02 UTC (permalink / raw)
  To: Theodore Ts'o, Ruslan Bilovol, adilger.kernel, linux-ext4,
	linux-kernel

Hi Ted,

On Tue, Jun 4, 2013 at 4:37 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> On Tue, Jun 04, 2013 at 02:15:57PM +0300, Ruslan Bilovol wrote:
>> > Have you actually seen a case where bh is non-NULL, but bh->b_data is
>> > NULL?  If not, it might be better to do something like this:
>>
>> Yes, this is exactly the situation I observe (bh is non-NULL, but
>> bh->b_data is NULL)
>
> Hmm... so the stack trace you sent in the commit description was one
> where bh->b_data was NULL?  I'm trying to make sure there isn't
> something else going on that we don't understand.
>
> Could you put some instrumentation in __find_get_block()?  Something like this:
>
> struct buffer_head *
> __find_get_block(struct block_device *bdev, sector_t block, unsigned size)
> {
>         struct buffer_head *bh = lookup_bh_lru(bdev, block, size);
>
>         if (bh == NULL) {
>                 bh = __find_get_block_slow(bdev, block);
>                 if (bh->b_data == NULL) {
>                    pr_crit("b_data NULL after find_get_block_slow\n);
>                    WARN_ON(1);
>                 }
>                 if (bh)
>                         bh_lru_install(bh);
>         } else {
>                 if (bh->b_data == NULL) {
>                         pr_crit("b_data NULL after lookup_bh_lru\n");
>                         WARN_ON(1);
>                 }
>         }
>         if (bh)
>                 touch_buffer(bh);
>         return bh;
> }
>
> ... and then send me the stack trace after running your reproduction
> case.  If it turns out the problem is in __find_get_block_slow(),
> could you put in similar debugging checks there and try to track it
> down?
>
> I'm pretty sure the case of bh non-NULL and bh->b_data NULL is never
> supposed to happen, and while we could just put a check where you
> suggested, there are plenty of other places which use __getblk(), and
> there may be other bugs that are hiding here.

Yes agree, that's what I told about in my cover letter fir this patch series.
I will debug it with code you mentioned, but the issue appears
very rarely, so I need at lease few days for catching this..

Regards,
Ruslan

>
> Regards,
>
>                                                 - Ted
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/



-- 
Best regards,
Ruslan Bilvol

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2013-06-06  8:02 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-06-03 10:00 [PATCH 0/2] Misc ext4 fixes Ruslan Bilovol
2013-06-03 10:00 ` [PATCH 1/2] jbd2: check bh->b_data for NULL in jbd2_journal_get_descriptor_buffer before memset() Ruslan Bilovol
2013-06-03 15:33   ` Theodore Ts'o
2013-06-04 11:15     ` Ruslan Bilovol
2013-06-04 13:37       ` Theodore Ts'o
2013-06-06  8:02         ` Ruslan Bilovol
2013-06-03 10:00 ` [PATCH 2/2] ext4: add sanity checks in __ext4_check_dir_entry Ruslan Bilovol
2013-06-03 15:40   ` Theodore Ts'o
2013-06-04 11:17     ` Ruslan Bilovol

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).