From: pg_lkm@lkm.for.sabi.co.UK (Peter Grandi)
To: Linux kernel <linux-kernel@vger.kernel.org>
Subject: oops while doing block IO
Date: Fri, 25 Aug 2006 14:02:07 +0100
Message-ID: <yf3wt8xm0j4.fsf@base.gp.example.com>

With 2.6.17 I get infrequent but very consistent null pointer
dereferences, especially but not only when lots of block IO is
going on (I back up with disk-to-disk copies).

Some more context: I use 'loop-AES' (may or may not be relevant),
and the error happens most often when lots of _concurrent_ block
IO takes place; usually mere single-threaded copying from one disk
to another does not trigger the error.

The relevant details are:
------------------------------------------------------------------------
base kernel: BUG: unable to handle kernel NULL pointer dereference at virtual address 00000040
------------------------------------------------------------------------
EIP is at generic_make_request+0x31/0x330
eax: 00000000 ebx: 00000200 ecx: a1a23480 edx: 95aa34c0
esi: 95aa34c0 edi: 00000000 ebp: b7f14cec esp: b7f14c64
ds: 007b es: 007b ss: 0068
Process pdflush (pid: 166, threadinfo=b7f14000 task=b7f15ab0)
------------------------------------------------------------------------
Call Trace:
<7810402d> show_stack_log_lvl+0x9d/0xd0 <78104267> show_registers+0x1b7/0x240
<78104422> die+0x132/0x330 <78116086> do_page_fault+0x296/0x6bc
<78103a9f> error_code+0x4f/0x54 <78235a18> submit_bio+0x58/0x100
<78169be3> submit_bh+0xd3/0x130 <7816b744> __block_write_full_page+0x1c4/0x3b0
<7816bd92> block_write_full_page+0x102/0x110 <78170560> blkdev_writepage+0x20/0x30
<7818f688> mpage_writepages+0x1b8/0x3d0 <78170510> generic_writepages+0x20/0x30
<7814e13d> do_writepages+0x2d/0x50 <7818da54> __writeback_single_inode+0x94/0x3c0
<7818e04b> sync_sb_inodes+0x1bb/0x2a0 <7818e6b3> writeback_inodes+0xc3/0xf9
<7814e4b0> background_writeout+0x80/0xa0 <7814ecde> pdflush+0xee/0x1b0
<78133385> kthread+0xc5/0xf0 <78100dd5> kernel_thread_helper+0x5/0x10
------------------------------------------------------------------------
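
For anyone decoding the oops: the EIP line gives the fault location as
symbol+offset/length. Below is a minimal sketch of splitting that text
apart, assuming the 'name+0xOFF/0xLEN' shape seen above. The names
'struct sym_loc' and 'parse_sym_loc' are hypothetical, used only for
illustration; they are not kernel interfaces.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one "symbol+0xOFF/0xLEN" location. */
struct sym_loc {
    char name[64];         /* symbol name, e.g. the faulting function */
    unsigned long offset;  /* bytes from the start of the function */
    unsigned long length;  /* total size of the function in bytes */
};

/*
 * Split text such as "generic_make_request+0x31/0x330".
 * Returns 0 on success, -1 if the text does not have that shape.
 */
static int parse_sym_loc(const char *text, struct sym_loc *out)
{
    assert(text != NULL && out != NULL);
    memset(out, 0, sizeof *out);

    /* %63[^+] reads the name up to '+'; %lx accepts an optional 0x prefix. */
    if (sscanf(text, "%63[^+]+%lx/%lx",
               out->name, &out->offset, &out->length) != 3)
        return -1;
    return 0;
}
```

For 'generic_make_request+0x31/0x330' this gives offset 0x31, i.e. 49
decimal, which is the '<generic_make_request+49>' line in the
disassembly quoted in this message.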
The relevant bit of code is:
------------------------------------------------------------------------
0x7823338e <generic_make_request+30>: call 0x78118410 <__might_sleep>
0x78233393 <generic_make_request+35>: call 0x783ea290 <cond_resched>
0x78233398 <generic_make_request+40>: mov 0x8(%ebp),%edx
0x7823339b <generic_make_request+43>: mov 0x8(%edx),%ecx
0x7823339e <generic_make_request+46>: mov 0x4(%ecx),%eax
0x782333a1 <generic_make_request+49>: mov 0x40(%eax),%edx
0x782333a4 <generic_make_request+52>: mov 0x3c(%eax),%eax
0x782333a7 <generic_make_request+55>: shrd $0x9,%edx,%eax
------------------------------------------------------------------------
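and the null pointer is clearly in '%eax' at the faulting instruction
(offset +49, the 0x31 of the EIP line; its 0x40 displacement matches
the faulting virtual address 00000040), which corresponds to
'->bd_inode' at the beginning of 'generic_make_request' in
'block/ll_rw_blk.c':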
------------------------------------------------------------------------
might_sleep();
/* Test device or partition size, when known. */
maxsector = bio->bi_bdev->bd_inode->i_size >> 9;
if (maxsector) {
------------------------------------------------------------------------
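
The reason the fault address is 00000040 rather than 0 can be sketched
in userspace: the load goes through a null base pointer plus the offset
of the field being read. This is an illustration only. 'struct
mock_inode', 'access_address' and 'size_to_sectors' are hypothetical
names, and the layout models just the two offsets visible in the
disassembly (0x3c and 0x40), assuming they are the low and high halves
of the 64-bit 'i_size' on this 32-bit build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: everything before offset 0x3c is opaque padding. */
struct mock_inode {
    unsigned char other_fields[0x3c];
    uint32_t i_size_lo;  /* read by: mov 0x3c(%eax),%eax */
    uint32_t i_size_hi;  /* read by: mov 0x40(%eax),%edx (the faulting load) */
};

/* Check at compile time that the mock matches the offsets in the listing. */
static_assert(offsetof(struct mock_inode, i_size_lo) == 0x3c, "low half offset");
static_assert(offsetof(struct mock_inode, i_size_hi) == 0x40, "high half offset");

/* Address touched when reading a field at 'offset' through 'base'. */
static uintptr_t access_address(uintptr_t base, size_t offset)
{
    return base + offset;
}

/* Join the two halves and convert bytes to 512-byte sectors (i_size >> 9). */
static uint64_t size_to_sectors(uint32_t lo, uint32_t hi)
{
    return (((uint64_t)hi << 32) | lo) >> 9;
}
```

With a null 'bd_inode' the base is 0, so the first load touches address
0 + 0x40, which is the address reported in the BUG line.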
As a crude fix I have done this patch:
------------------------------------------------------------------------
--- block/ll_rw_blk.c-dist 2006-07-26 12:35:54.918926000 +0100
+++ block/ll_rw_blk.c 2006-08-24 11:55:01.806241906 +0100
@@ -80,6 +80,16 @@
#define BLK_BATCH_REQ 32
+
+/*
+ * Return the maximum number of sectors for the block device,
+ * or 0 if unknown.
+ */
+static inline sector_t bio_max_sector(const struct bio *const bio)
+{
+ return (bio->bi_bdev->bd_inode == 0) ? 0
+ : bio->bi_bdev->bd_inode->i_size >> 9;
+}
/*
* Return the threshold (number of used requests) at which the queue is
* considered to be congested. It include a little hysteresis to keep the
* context switch rate down.
@@ -2983,7 +2993,7 @@
bdevname(bio->bi_bdev, b),
bio->bi_rw,
(unsigned long long)bio->bi_sector + bio_sectors(bio),
- (long long)(bio->bi_bdev->bd_inode->i_size >> 9));
+ (long long) bio_max_sector(bio));
set_bit(BIO_EOF, &bio->bi_flags);
}
@@ -3021,7 +3031,7 @@
might_sleep();
/* Test device or partition size, when known. */
- maxsector = bio->bi_bdev->bd_inode->i_size >> 9;
+ maxsector = bio_max_sector(bio);
if (maxsector) {
sector_t sector = bio->bi_sector;
------------------------------------------------------------------------
Which just prevents the null pointer dereference. The wider
question is why 'bd_inode' is null, but only in some cases...
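
The behaviour of the guard can be checked outside the kernel with a
minimal userspace sketch. The struct definitions below are cut-down
stand-ins holding only the fields the helper touches, and 'sector_t' is
assumed to be an unsigned 64-bit type here; none of this is the real
kernel layout. Note that, like the patch, it guards only 'bd_inode',
not 'bi_bdev'.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long long sector_t;  /* assumption: stand-in for the kernel type */

/* Cut-down stand-ins: only the fields used by the helper are present. */
struct inode { long long i_size; };
struct block_device { struct inode *bd_inode; };
struct bio { struct block_device *bi_bdev; };

/*
 * Return the maximum number of sectors for the block device,
 * or 0 if unknown (bd_inode is null).
 */
static inline sector_t bio_max_sector(const struct bio *const bio)
{
    /* The patch does not guard these two pointers; the mock asserts them. */
    assert(bio != NULL && bio->bi_bdev != NULL);

    return (bio->bi_bdev->bd_inode == NULL) ? 0
        : (sector_t)(bio->bi_bdev->bd_inode->i_size >> 9);
}
```

A zero return takes the same path as an unknown device size, so the
'if (maxsector)' range check is simply skipped for such a bio.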