* Re: Kernel oops when accessing to mounted, but unplugged JFS
[not found] <20101121122008.GE4024__36829.3162588545$1290342860$gmane$org@localhost>
@ 2010-11-21 17:57 ` Andi Kleen
2010-11-22 16:22 ` [Jfs-discussion] " Dave Kleikamp
0 siblings, 1 reply; 5+ messages in thread
From: Andi Kleen @ 2010-11-21 17:57 UTC (permalink / raw)
To: Alexander Kolesen; +Cc: linux-fsdevel, jfs-discussion, viro, rjw
Alexander Kolesen <kolesen.a@gmail.com> writes:
[Rafael, a new regression]
> Hello.
> I've built a kernel from
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
> (Date: Fri Nov 19 19:46:45 2010 -0800)
> and got a kernel oops when tried to access to unplugged,
> but mounted external usb storage formatted with JFS.
Al seems to be the last person who touched it, maybe it's
related to his changes, like this one.

commit 152a08366671080f27b32e0c411ad620c5f88b57
Author: Al Viro <viro@zeniv.linux.org.uk>
Date:   Sun Jul 25 00:46:55 2010 +0400

    new helper: mount_bdev()

    ... and switch of the obvious get_sb_bdev() users to ->mount()

    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

>
> Steps to reproduce:
> mkfs.jfs /dev/sdb1 (unpluggable USB hard drive)
> mount /dev/sdb1 /mnt/drive
> cd /mnt/drive
> touch test
> sync
> ..unplug a drive
> ls
>
>
> Result:
> BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000020
> IP: __mark_inode_dirty
>
> Then I got a kernel coredump. Here is a stack trace:
>
> (gdb) bt
> #0 __mark_inode_dirty (inode=0xffff880078fc1490, flags=<value optimized out>) at fs/fs-writeback.c:990
> #1 0xffffffff810e4990 in mark_inode_dirty_sync (mnt=0xffff88007862fd00, dentry=<value optimized out>) at include/linux/fs.h:1687
> #2 touch_atime (mnt=0xffff88007862fd00, dentry=<value optimized out>) at fs/inode.c:1505
> #3 0xffffffff810dfeb4 in file_accessed (file=0xffff88006afed600, filler=0xffffffff810dfcf8 <filldir>, buf=0xffff88007762bf38) at include/linux/fs.h:1763
> #4 vfs_readdir (file=0xffff88006afed600, filler=0xffffffff810dfcf8 <filldir>, buf=0xffff88007762bf38) at fs/readdir.c:41
> #5 0xffffffff810e001a in sys_getdents (fd=<value optimized out>, dirent=0x1f61468, count=32768) at fs/readdir.c:214
> #6 0xffffffff810279ab in ?? () at arch/x86/kernel/entry_64.S:479
> #7 0x00007f8b23d1a4c5 in ?? ()
> #8 0x00000000000002bb in ?? ()
> #9 0x0000000000000000 in ?? ()
>
> (gdb) p bdi
> $1 = (struct backing_dev_info *) 0x0
>
> (gdb) p inode->i_mapping->backing_dev_info
> $15 = (struct backing_dev_info *) 0xffff880078878d48
>
> (gdb) p inode->i_sb->s_bdi
> $16 = (struct backing_dev_info *) 0x0
>
>
> I can't do git bisect because on n'th step my system became unbootable.
> But 2.6.35 doesn't fall.
>
>
--
ak@linux.intel.com -- Speaking for myself only.
* Re: [Jfs-discussion] Kernel oops when accessing to mounted, but unplugged JFS
2010-11-21 17:57 ` Kernel oops when accessing to mounted, but unplugged JFS Andi Kleen
@ 2010-11-22 16:22 ` Dave Kleikamp
2010-11-22 21:20 ` Alexander Kolesen
0 siblings, 1 reply; 5+ messages in thread
From: Dave Kleikamp @ 2010-11-22 16:22 UTC (permalink / raw)
To: Andi Kleen
Cc: Alexander Kolesen, linux-fsdevel, rjw, jfs-discussion, viro,
Christoph Hellwig, Jens Axboe
On Sun, 2010-11-21 at 18:57 +0100, Andi Kleen wrote:
> Alexander Kolesen <kolesen.a@gmail.com> writes:
>
> [Rafael, a new regression]
>
> > Hello.
> > I've built a kernel from
> > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
> > (Date: Fri Nov 19 19:46:45 2010 -0800)
> > and got a kernel oops when tried to access to unplugged,
> > but mounted external usb storage formatted with JFS.
>
> Al seems to be the last person who touched it, maybe it's
> related to his changes, like this one.
>
> commit 152a08366671080f27b32e0c411ad620c5f88b57
> Author: Al Viro <viro@zeniv.linux.org.uk>
> Date: Sun Jul 25 00:46:55 2010 +0400
>
> new helper: mount_bdev()
>
> ... and switch of the obvious get_sb_bdev() users to ->mount()
>
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
I haven't dug too far into this yet, but I suspect that
aaead25b954879e1a708ff2f3602f494c18d20b5 is related.

commit aaead25b954879e1a708ff2f3602f494c18d20b5
Author: Christoph Hellwig <hch@lst.de>
Date:   Mon Oct 4 14:25:33 2010 +0200

    writeback: always use sb->s_bdi for writeback purposes

I'll have to look at what happens when a device is unplugged to see if
JFS is missing something, or it's more of a generic problem. I'm open
to suggestions from anybody on cc.

Thanks,
Shaggy

>
> >
> > Steps to reproduce:
> > mkfs.jfs /dev/sdb1 (unpluggable USB hard drive)
> > mount /dev/sdb1 /mnt/drive
> > cd /mnt/drive
> > touch test
> > sync
> > ..unplug a drive
> > ls
> >
> >
> > Result:
> > BUG: unable to handle kernel NULL pointer dereference at
> > 0000000000000020
> > IP: __mark_inode_dirty
> >
> > Then I got a kernel coredump. Here is a stack trace:
> >
> > (gdb) bt
> > #0 __mark_inode_dirty (inode=0xffff880078fc1490, flags=<value optimized out>) at fs/fs-writeback.c:990
> > #1 0xffffffff810e4990 in mark_inode_dirty_sync (mnt=0xffff88007862fd00, dentry=<value optimized out>) at include/linux/fs.h:1687
> > #2 touch_atime (mnt=0xffff88007862fd00, dentry=<value optimized out>) at fs/inode.c:1505
> > #3 0xffffffff810dfeb4 in file_accessed (file=0xffff88006afed600, filler=0xffffffff810dfcf8 <filldir>, buf=0xffff88007762bf38) at include/linux/fs.h:1763
> > #4 vfs_readdir (file=0xffff88006afed600, filler=0xffffffff810dfcf8 <filldir>, buf=0xffff88007762bf38) at fs/readdir.c:41
> > #5 0xffffffff810e001a in sys_getdents (fd=<value optimized out>, dirent=0x1f61468, count=32768) at fs/readdir.c:214
> > #6 0xffffffff810279ab in ?? () at arch/x86/kernel/entry_64.S:479
> > #7 0x00007f8b23d1a4c5 in ?? ()
> > #8 0x00000000000002bb in ?? ()
> > #9 0x0000000000000000 in ?? ()
> >
> > (gdb) p bdi
> > $1 = (struct backing_dev_info *) 0x0
> >
> > (gdb) p inode->i_mapping->backing_dev_info
> > $15 = (struct backing_dev_info *) 0xffff880078878d48
> >
> > (gdb) p inode->i_sb->s_bdi
> > $16 = (struct backing_dev_info *) 0x0
> >
> >
> > I can't do git bisect because on n'th step my system became unbootable.
> > But 2.6.35 doesn't fall.
--
Dave Kleikamp
IBM Linux Technology Center
* Re: [Jfs-discussion] Kernel oops when accessing to mounted, but unplugged JFS
2010-11-22 16:22 ` [Jfs-discussion] " Dave Kleikamp
@ 2010-11-22 21:20 ` Alexander Kolesen
2010-11-23 3:41 ` Dave Kleikamp
0 siblings, 1 reply; 5+ messages in thread
From: Alexander Kolesen @ 2010-11-22 21:20 UTC (permalink / raw)
To: Dave Kleikamp
Cc: linux-fsdevel, rjw, jfs-discussion, viro, Christoph Hellwig,
Jens Axboe
> On Sun, 2010-11-21 at 18:57 +0100, Andi Kleen wrote:
> > Alexander Kolesen <kolesen.a@gmail.com> writes:
> >
> > [Rafael, a new regression]
> >
> > > Hello.
> > > I've built a kernel from
> > > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
> > > (Date: Fri Nov 19 19:46:45 2010 -0800)
> > > and got a kernel oops when tried to access to unplugged,
> > > but mounted external usb storage formatted with JFS.
> >
> > Al seems to be the last person who touched it, maybe it's
> > related to his changes, like this one.
> >
> > commit 152a08366671080f27b32e0c411ad620c5f88b57
> > Author: Al Viro <viro@zeniv.linux.org.uk>
> > Date: Sun Jul 25 00:46:55 2010 +0400
> >
> > new helper: mount_bdev()
> >
> > ... and switch of the obvious get_sb_bdev() users to ->mount()
> >
> > Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
>
> I haven't dug too far into this yet, but I suspect that
> aaead25b954879e1a708ff2f3602f494c18d20b5 is related.
>
> commit aaead25b954879e1a708ff2f3602f494c18d20b5
> Author: Christoph Hellwig <hch@lst.de>
> Date: Mon Oct 4 14:25:33 2010 +0200
>
> writeback: always use sb->s_bdi for writeback purposes
>
> I'll have to look at what happens when a device is unplugged to see if
> JFS is missing something, or it's more of a generic problem. I'm open
> to suggestions from anybody on cc.
>
> Thanks,
> Shaggy
>
I've tested kernels before and after this commit. Yes, the oops reproduces
after it and does not reproduce before it.
> >
> > >
> > > Steps to reproduce:
> > > mkfs.jfs /dev/sdb1 (unpluggable USB hard drive)
> > > mount /dev/sdb1 /mnt/drive
> > > cd /mnt/drive
> > > touch test
> > > sync
> > > ..unplug a drive
> > > ls
> > >
> > >
> > > Result:
> > > BUG: unable to handle kernel NULL pointer dereference at
> > > 0000000000000020
> > > IP: __mark_inode_dirty
> > >
> > > Then I got a kernel coredump. Here is a stack trace:
> > >
> > > (gdb) bt
> > > #0 __mark_inode_dirty (inode=0xffff880078fc1490, flags=<value optimized out>) at fs/fs-writeback.c:990
> > > #1 0xffffffff810e4990 in mark_inode_dirty_sync (mnt=0xffff88007862fd00, dentry=<value optimized out>) at include/linux/fs.h:1687
> > > #2 touch_atime (mnt=0xffff88007862fd00, dentry=<value optimized out>) at fs/inode.c:1505
> > > #3 0xffffffff810dfeb4 in file_accessed (file=0xffff88006afed600, filler=0xffffffff810dfcf8 <filldir>, buf=0xffff88007762bf38) at include/linux/fs.h:1763
> > > #4 vfs_readdir (file=0xffff88006afed600, filler=0xffffffff810dfcf8 <filldir>, buf=0xffff88007762bf38) at fs/readdir.c:41
> > > #5 0xffffffff810e001a in sys_getdents (fd=<value optimized out>, dirent=0x1f61468, count=32768) at fs/readdir.c:214
> > > #6 0xffffffff810279ab in ?? () at arch/x86/kernel/entry_64.S:479
> > > #7 0x00007f8b23d1a4c5 in ?? ()
> > > #8 0x00000000000002bb in ?? ()
> > > #9 0x0000000000000000 in ?? ()
> > >
> > > (gdb) p bdi
> > > $1 = (struct backing_dev_info *) 0x0
> > >
> > > (gdb) p inode->i_mapping->backing_dev_info
> > > $15 = (struct backing_dev_info *) 0xffff880078878d48
> > >
> > > (gdb) p inode->i_sb->s_bdi
> > > $16 = (struct backing_dev_info *) 0x0
> > >
> > >
> > > I can't do git bisect because on n'th step my system became unbootable.
> > > But 2.6.35 doesn't fall.
> --
> Dave Kleikamp
> IBM Linux Technology Center
>
* Re: [Jfs-discussion] Kernel oops when accessing to mounted, but unplugged JFS
2010-11-22 21:20 ` Alexander Kolesen
@ 2010-11-23 3:41 ` Dave Kleikamp
2010-11-23 8:37 ` Christoph Hellwig
0 siblings, 1 reply; 5+ messages in thread
From: Dave Kleikamp @ 2010-11-23 3:41 UTC (permalink / raw)
To: Alexander Kolesen
Cc: jfs-discussion, Jens Axboe, rjw, viro, linux-fsdevel,
Christoph Hellwig
On Mon, 2010-11-22 at 23:20 +0200, Alexander Kolesen wrote:
> > I haven't dug too far into this yet, but I suspect that
> > aaead25b954879e1a708ff2f3602f494c18d20b5 is related.
> >
> > commit aaead25b954879e1a708ff2f3602f494c18d20b5
> > Author: Christoph Hellwig <hch@lst.de>
> > Date: Mon Oct 4 14:25:33 2010 +0200
> >
> > writeback: always use sb->s_bdi for writeback purposes
> >
> > I'll have to look at what happens when a device is unplugged to see if
> > JFS is missing something, or it's more of a generic problem. I'm open
> > to suggestions from anybody on cc.
> >
> > Thanks,
> > Shaggy
> >
>
> I've tested kernel before and after this commit. Yes, it reproduced after,
> and didn't reproduced before.

I recreated the problem on ext3 as well, so it's not specific to JFS.

I see three potential ways to fix this.

1. bdi_prune_sb() could set sb->s_bdi to &default_backing_dev_info
   rather than NULL
2. inode_to_bdi() could return &default_backing_dev_info (or
   inode->i_mapping->backing_dev_info) if sb->s_bdi is NULL.
3. the callers of inode_to_bdi() could check for s_bdi being NULL and
   exit gracefully.

It seems that Jens and Christoph have ideas about cleaning up the bdi
stuff, so this may be a short-term fix.

Here's a patch for option 2.

---------------------------------

fs: avoid null pointer dereference when a block device is unplugged

Physically unplugging a block device when a file system is mounted can
result in sb->s_bdi being set to NULL. The callers of inode_to_bdi()
expect a non-NULL pointer. Return &default_backing_dev_info instead
of NULL.

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
---
 fs/fs-writeback.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 3d06ccc..5f8cc5d 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -76,7 +76,7 @@ static inline struct backing_dev_info *inode_to_bdi(struct inode *inode)
 	if (strcmp(sb->s_type->name, "bdev") == 0)
 		return inode->i_mapping->backing_dev_info;
 
-	return sb->s_bdi;
+	return sb->s_bdi ? sb->s_bdi : &default_backing_dev_info;
 }
 
 static inline struct inode *wb_inode(struct list_head *head)

--
Dave Kleikamp
IBM Linux Technology Center
* Re: [Jfs-discussion] Kernel oops when accessing to mounted, but unplugged JFS
2010-11-23 3:41 ` Dave Kleikamp
@ 2010-11-23 8:37 ` Christoph Hellwig
0 siblings, 0 replies; 5+ messages in thread
From: Christoph Hellwig @ 2010-11-23 8:37 UTC (permalink / raw)
To: Dave Kleikamp
Cc: Alexander Kolesen, jfs-discussion, Jens Axboe, rjw, viro,
linux-fsdevel, Christoph Hellwig
On Mon, Nov 22, 2010 at 09:41:13PM -0600, Dave Kleikamp wrote:
> I see three potential ways to fix this.
>
> 1. bdi_prune_sb() could set sb->s_bdi to &default_backing_dev_info
> rather than NULL
> 2. inode_to_bdi() could return &default_backing_dev_info (or
> inode->i_mapping->backing_dev_info) if sb->s_bdi is NULL.
> 3. the callers of inode_to_bdi() could check for s_bdi being NULL and
> exit gracefully.
>
> It seems that Jens and Christoph have ideas about cleaning up the bdi
> stuff, so this may be a short-term fix.
It's a mess. The correct fix is to never unregister a bdi that still
has a live filesystem on it. This seems to be solved by plain removing
the bdi_unregister call in unlink_gendisk. Unlink_gendisk just removes
the gendisk from visibility, but it still lives on as long as we have
references to it. We already have a bdi_destroy call in
blk_release_queue that should unregister the BDI once its reference
count finally reaches zero.