linux-fsdevel.vger.kernel.org archive mirror
* Re: Kernel oops when accessing to mounted, but unplugged JFS
       [not found] <20101121122008.GE4024__36829.3162588545$1290342860$gmane$org@localhost>
@ 2010-11-21 17:57 ` Andi Kleen
  2010-11-22 16:22   ` [Jfs-discussion] " Dave Kleikamp
  0 siblings, 1 reply; 5+ messages in thread
From: Andi Kleen @ 2010-11-21 17:57 UTC
  To: Alexander Kolesen; +Cc: linux-fsdevel, jfs-discussion, viro, rjw

Alexander Kolesen <kolesen.a@gmail.com> writes:

[Rafael, a new regression]

> Hello.
> I've built a kernel from
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
> (Date:   Fri Nov 19 19:46:45 2010 -0800)
> and got a kernel oops when I tried to access an unplugged but still
> mounted external USB storage device formatted with JFS.

Al seems to be the last person who touched it, maybe it's 
related to his changes, like this one. 

commit 152a08366671080f27b32e0c411ad620c5f88b57
Author: Al Viro <viro@zeniv.linux.org.uk>
Date:   Sun Jul 25 00:46:55 2010 +0400

    new helper: mount_bdev()

    ... and switch of the obvious get_sb_bdev() users to ->mount()

    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


>  
> Steps to reproduce:
>  mkfs.jfs /dev/sdb1 (unpluggable USB hard drive)
>  mount /dev/sdb1 /mnt/drive
>  cd /mnt/drive
>  touch test
>  sync
>  ..unplug a drive
>  ls
>   
>
> Result:
>    BUG: unable to handle kernel NULL pointer dereference at
>    0000000000000020
>    IP: __mark_inode_dirty
>
> Then I got a kernel coredump. Here is a stack trace:
>
> (gdb) bt
> #0  __mark_inode_dirty (inode=0xffff880078fc1490, flags=<value optimized out>) at fs/fs-writeback.c:990
> #1  0xffffffff810e4990 in mark_inode_dirty_sync (mnt=0xffff88007862fd00, dentry=<value optimized out>) at include/linux/fs.h:1687
> #2  touch_atime (mnt=0xffff88007862fd00, dentry=<value optimized out>) at fs/inode.c:1505
> #3  0xffffffff810dfeb4 in file_accessed (file=0xffff88006afed600, filler=0xffffffff810dfcf8 <filldir>, buf=0xffff88007762bf38) at include/linux/fs.h:1763
> #4  vfs_readdir (file=0xffff88006afed600, filler=0xffffffff810dfcf8 <filldir>, buf=0xffff88007762bf38) at fs/readdir.c:41
> #5  0xffffffff810e001a in sys_getdents (fd=<value optimized out>, dirent=0x1f61468, count=32768) at fs/readdir.c:214
> #6  0xffffffff810279ab in ?? () at arch/x86/kernel/entry_64.S:479
> #7  0x00007f8b23d1a4c5 in ?? ()
> #8  0x00000000000002bb in ?? ()
> #9  0x0000000000000000 in ?? ()
>
> (gdb) p bdi
> $1 = (struct backing_dev_info *) 0x0
>
> (gdb) p inode->i_mapping->backing_dev_info 
> $15 = (struct backing_dev_info *) 0xffff880078878d48
>
> (gdb) p inode->i_sb->s_bdi
> $16 = (struct backing_dev_info *) 0x0
>
>
> I can't do a git bisect because at the n-th step my system became unbootable.
> But 2.6.35 does not crash.
>
>

-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: [Jfs-discussion] Kernel oops when accessing to mounted, but unplugged JFS
  2010-11-21 17:57 ` Kernel oops when accessing to mounted, but unplugged JFS Andi Kleen
@ 2010-11-22 16:22   ` Dave Kleikamp
  2010-11-22 21:20     ` Alexander Kolesen
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Kleikamp @ 2010-11-22 16:22 UTC
  To: Andi Kleen
  Cc: Alexander Kolesen, linux-fsdevel, rjw, jfs-discussion, viro,
	Christoph Hellwig, Jens Axboe

On Sun, 2010-11-21 at 18:57 +0100, Andi Kleen wrote:
> Alexander Kolesen <kolesen.a@gmail.com> writes:
> 
> [Rafael, a new regression]
> 
> > Hello.
> > I've built a kernel from
> > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
> > (Date:   Fri Nov 19 19:46:45 2010 -0800)
> > and got a kernel oops when I tried to access an unplugged but still
> > mounted external USB storage device formatted with JFS.
> 
> Al seems to be the last person who touched it, maybe it's 
> related to his changes, like this one. 
> 
> commit 152a08366671080f27b32e0c411ad620c5f88b57
> Author: Al Viro <viro@zeniv.linux.org.uk>
> Date:   Sun Jul 25 00:46:55 2010 +0400
> 
>     new helper: mount_bdev()
> 
>     ... and switch of the obvious get_sb_bdev() users to ->mount()
> 
>     Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

I haven't dug too far into this yet, but I suspect that
aaead25b954879e1a708ff2f3602f494c18d20b5 is related.

commit aaead25b954879e1a708ff2f3602f494c18d20b5
Author: Christoph Hellwig <hch@lst.de>
Date:   Mon Oct 4 14:25:33 2010 +0200

    writeback: always use sb->s_bdi for writeback purposes

I'll have to look at what happens when a device is unplugged to see if
JFS is missing something, or it's more of a generic problem.  I'm open
to suggestions from anybody on cc.

Thanks,
Shaggy

-- 
Dave Kleikamp
IBM Linux Technology Center



* Re: [Jfs-discussion] Kernel oops when accessing to mounted, but unplugged JFS
  2010-11-22 16:22   ` [Jfs-discussion] " Dave Kleikamp
@ 2010-11-22 21:20     ` Alexander Kolesen
  2010-11-23  3:41       ` Dave Kleikamp
  0 siblings, 1 reply; 5+ messages in thread
From: Alexander Kolesen @ 2010-11-22 21:20 UTC
  To: Dave Kleikamp
  Cc: linux-fsdevel, rjw, jfs-discussion, viro, Christoph Hellwig,
	Jens Axboe

> On Sun, 2010-11-21 at 18:57 +0100, Andi Kleen wrote:
> > Alexander Kolesen <kolesen.a@gmail.com> writes:
> > 
> > [Rafael, a new regression]
> > 
> > > Hello.
> > > I've built a kernel from
> > > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
> > > (Date:   Fri Nov 19 19:46:45 2010 -0800)
> > > and got a kernel oops when I tried to access an unplugged but still
> > > mounted external USB storage device formatted with JFS.
> > 
> > Al seems to be the last person who touched it, maybe it's 
> > related to his changes, like this one. 
> > 
> > commit 152a08366671080f27b32e0c411ad620c5f88b57
> > Author: Al Viro <viro@zeniv.linux.org.uk>
> > Date:   Sun Jul 25 00:46:55 2010 +0400
> > 
> >     new helper: mount_bdev()
> > 
> >     ... and switch of the obvious get_sb_bdev() users to ->mount()
> > 
> >     Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> 
> I haven't dug too far into this yet, but I suspect that
> aaead25b954879e1a708ff2f3602f494c18d20b5 is related.
> 
> commit aaead25b954879e1a708ff2f3602f494c18d20b5
> Author: Christoph Hellwig <hch@lst.de>
> Date:   Mon Oct 4 14:25:33 2010 +0200
> 
>     writeback: always use sb->s_bdi for writeback purposes
> 
> I'll have to look at what happens when a device is unplugged to see if
> JFS is missing something, or it's more of a generic problem.  I'm open
> to suggestions from anybody on cc.
> 
> Thanks,
> Shaggy
>

I've tested kernels from before and after this commit. The oops reproduces
with the commit applied and does not reproduce without it.
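
For illustration only, here is a small user-space model of the state that the
gdb output in the original report shows: the inode's mapping still points at a
valid backing_dev_info, while the superblock's s_bdi is NULL. The struct
layouts are reduced to the fields printed in gdb, and the helper names
bdi_from_mapping() and bdi_from_sb() are hypothetical, not kernel functions.
It assumes, as the commit title suggests, that writeback code now takes the
bdi from the superblock.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-ins for the kernel structures; only the fields
 * printed in the gdb session are modelled. */
struct backing_dev_info { int unused; };
struct address_space { struct backing_dev_info *backing_dev_info; };
struct super_block { struct backing_dev_info *s_bdi; };
struct inode {
	struct address_space *i_mapping;
	struct super_block *i_sb;
};

/* State after the unplug, mirroring the gdb output:
 * i_mapping->backing_dev_info is valid, i_sb->s_bdi is NULL. */
static struct backing_dev_info model_bdi;
static struct address_space model_mapping = { .backing_dev_info = &model_bdi };
static struct super_block model_sb = { .s_bdi = NULL };
static struct inode model_inode = {
	.i_mapping = &model_mapping,
	.i_sb = &model_sb,
};

/* Hypothetical helper: reach the bdi through the inode's mapping. */
static struct backing_dev_info *bdi_from_mapping(const struct inode *inode)
{
	return inode->i_mapping->backing_dev_info;
}

/* Hypothetical helper: reach the bdi through the superblock only. */
static struct backing_dev_info *bdi_from_sb(const struct inode *inode)
{
	return inode->i_sb->s_bdi;
}
```

With this state, bdi_from_mapping(&model_inode) yields a usable pointer and
bdi_from_sb(&model_inode) yields NULL, which matches an oops that appears only
once the superblock pointer is the one being dereferenced.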



* Re: [Jfs-discussion] Kernel oops when accessing to mounted, but unplugged JFS
  2010-11-22 21:20     ` Alexander Kolesen
@ 2010-11-23  3:41       ` Dave Kleikamp
  2010-11-23  8:37         ` Christoph Hellwig
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Kleikamp @ 2010-11-23  3:41 UTC
  To: Alexander Kolesen
  Cc: jfs-discussion, Jens Axboe, rjw, viro, linux-fsdevel,
	Christoph Hellwig

On Mon, 2010-11-22 at 23:20 +0200, Alexander Kolesen wrote:

> > I haven't dug too far into this yet, but I suspect that
> > aaead25b954879e1a708ff2f3602f494c18d20b5 is related.
> > 
> > commit aaead25b954879e1a708ff2f3602f494c18d20b5
> > Author: Christoph Hellwig <hch@lst.de>
> > Date:   Mon Oct 4 14:25:33 2010 +0200
> > 
> >     writeback: always use sb->s_bdi for writeback purposes
> > 
> > I'll have to look at what happens when a device is unplugged to see if
> > JFS is missing something, or it's more of a generic problem.  I'm open
> > to suggestions from anybody on cc.
> > 
> > Thanks,
> > Shaggy
> >
> 
> I've tested kernels from before and after this commit. The oops reproduces
> with the commit applied and does not reproduce without it.

I recreated the problem on ext3 as well, so it's not specific to JFS.

I see three potential ways to fix this.

1. bdi_prune_sb() could set sb->s_bdi to &default_backing_dev_info
rather than NULL
2. inode_to_bdi() could return &default_backing_dev_info (or
inode->i_mapping->backing_dev_info) if sb->s_bdi is NULL.
3. the callers of inode_to_bdi() could check for s_bdi being NULL and
exit gracefully.

It seems that Jens and Christoph have ideas about cleaning up the bdi
stuff, so this may be a short-term fix.
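
To make the three options concrete, the following is a rough user-space
sketch, not kernel code.  The structs are cut down to one field each,
default_bdi stands in for default_backing_dev_info, and every function name
here (prune_sb_current, prune_sb_option1, sb_to_bdi_option2,
mark_dirty_option3) is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-ins for the kernel structures. */
struct backing_dev_info { int dirty_inodes; };
struct super_block { struct backing_dev_info *s_bdi; };

/* Stand-in for the kernel's default_backing_dev_info. */
static struct backing_dev_info default_bdi;

/* Behaviour being fixed: the superblock is left with a NULL bdi. */
static void prune_sb_current(struct super_block *sb)
{
	sb->s_bdi = NULL;
}

/* Option 1: repoint the superblock at the default bdi instead of NULL. */
static void prune_sb_option1(struct super_block *sb)
{
	sb->s_bdi = &default_bdi;
}

/* Option 2: the lookup itself falls back to the default bdi. */
static struct backing_dev_info *sb_to_bdi_option2(struct super_block *sb)
{
	return sb->s_bdi ? sb->s_bdi : &default_bdi;
}

/* Option 3: the caller checks for NULL and exits gracefully.
 * Returns 0 on success, -1 if the superblock has no bdi. */
static int mark_dirty_option3(struct super_block *sb)
{
	struct backing_dev_info *bdi = sb->s_bdi;

	if (!bdi)
		return -1;
	bdi->dirty_inodes++;
	return 0;
}
```

Option 2 touches a single lookup site, while option 3 would need the same
check repeated in every caller.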

Here's a patch for option 2.
  ---------------------------------
fs: avoid null pointer dereference when a block device is unplugged

Physically unplugging a block device when a file system is mounted can
result in sb->s_bdi being set to NULL.  The callers of inode_to_bdi()
expect a non-NULL pointer.  Return &default_backing_dev_info instead
of NULL.

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
---
 fs/fs-writeback.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 3d06ccc..5f8cc5d 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -76,7 +76,7 @@ static inline struct backing_dev_info *inode_to_bdi(struct inode *inode)
 	if (strcmp(sb->s_type->name, "bdev") == 0)
 		return inode->i_mapping->backing_dev_info;
 
-	return sb->s_bdi;
+	return sb->s_bdi ? sb->s_bdi : &default_backing_dev_info;
 }
 
 static inline struct inode *wb_inode(struct list_head *head)


-- 
Dave Kleikamp
IBM Linux Technology Center




* Re: [Jfs-discussion] Kernel oops when accessing to mounted, but unplugged JFS
  2010-11-23  3:41       ` Dave Kleikamp
@ 2010-11-23  8:37         ` Christoph Hellwig
  0 siblings, 0 replies; 5+ messages in thread
From: Christoph Hellwig @ 2010-11-23  8:37 UTC
  To: Dave Kleikamp
  Cc: Alexander Kolesen, jfs-discussion, Jens Axboe, rjw, viro,
	linux-fsdevel, Christoph Hellwig

On Mon, Nov 22, 2010 at 09:41:13PM -0600, Dave Kleikamp wrote:
> I see three potential ways to fix this.
> 
> 1. bdi_prune_sb() could set sb->s_bdi to &default_backing_dev_info
> rather than NULL
> 2. inode_to_bdi() could return &default_backing_dev_info (or
> inode->i_mapping->backing_dev_info) if sb->s_bdi is NULL.
> 3. the callers of inode_to_bdi() could check for s_bdi being NULL and
> exit gracefully.
> 
> It seems that Jens and Christoph have ideas about cleaning up the bdi
> stuff, so this may be a short-term fix.

It's a mess.  The correct fix is to never unregister a bdi that still
has a live filesystem on it.  This seems to be solved by simply removing
the bdi_unregister call in unlink_gendisk.  unlink_gendisk just removes
the gendisk from visibility, but it still lives on as long as we have
references to it.  We already have a bdi_destroy call in
blk_release_queue that should unregister the BDI once its reference
count finally reaches zero.
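
The lifetime rule described here can be sketched in user space as follows.
This is only a model under simplifying assumptions: a single counter stands in
for the gendisk and queue references, and struct model_queue and the model_*
functions are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>

/* One object standing in for the disk, its queue and its bdi. */
struct model_queue {
	int refcount;          /* references held by the driver and by mounts */
	bool visible;          /* still visible as a device */
	bool bdi_registered;   /* bdi still usable by a mounted filesystem */
};

static void model_queue_init(struct model_queue *q)
{
	q->refcount = 1;       /* the driver's initial reference */
	q->visible = true;
	q->bdi_registered = true;
}

static void model_get(struct model_queue *q)
{
	assert(q->refcount > 0);
	q->refcount++;
}

/* Unplug: hide the disk but deliberately leave the bdi alone. */
static void model_unlink(struct model_queue *q)
{
	q->visible = false;
}

/* Drop a reference; the bdi goes away only with the last one. */
static void model_put(struct model_queue *q)
{
	assert(q->refcount > 0);
	if (--q->refcount == 0)
		q->bdi_registered = false;
}
```

In this model a mounted filesystem takes a reference with model_get(), so an
unplug followed by the driver's model_put() still leaves the bdi registered
until the filesystem drops its own reference at unmount.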

