public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* XFS oops under 2.6.23.9
@ 2008-01-23  4:30 Jonathan Woithe
  2008-01-23  5:34 ` do_remount_sb(RDONLY) race? (was: XFS oops under 2.6.23.9) David Chinner
  0 siblings, 1 reply; 5+ messages in thread
From: Jonathan Woithe @ 2008-01-23  4:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jonathan Woithe

Last night my laptop suffered an oops during closedown.  The full oops
reports can be downloaded from

  http://www.atrad.com.au/~jwoithe/xfs_oops/

as photos of the screen.  Since the laptop was unusable at this point I
wasn't able to cut and paste the details, and they weren't in the logs when
the machine was rebooted.

The initial complaint claims to be an "invalid opcode".  Is this possibly a
memory fault developing or does it ring any bells for anyone?  memtest86
finds no fault with the memory.

Kernel version was kernel.org 2.6.23.9 compiled as a low latency desktop. 
The RT patches were not applied.

Regards
  jonathan

^ permalink raw reply	[flat|nested] 5+ messages in thread

* do_remount_sb(RDONLY) race? (was: XFS oops under 2.6.23.9)
  2008-01-23  4:30 XFS oops under 2.6.23.9 Jonathan Woithe
@ 2008-01-23  5:34 ` David Chinner
  2008-01-23  5:54   ` Jonathan Woithe
  2008-01-23 12:55   ` Christoph Hellwig
  0 siblings, 2 replies; 5+ messages in thread
From: David Chinner @ 2008-01-23  5:34 UTC (permalink / raw)
  To: Jonathan Woithe; +Cc: linux-kernel

On Wed, Jan 23, 2008 at 03:00:48PM +1030, Jonathan Woithe wrote:
> Last night my laptop suffered an oops during closedown.  The full oops
> reports can be downloaded from
> 
>   http://www.atrad.com.au/~jwoithe/xfs_oops/

Assertion failed: atomic_read(&mp->m_active_trans) == 0, file:
fs/xfs/xfs_vfsops.c, line 689.

The remount read-only of the root drive supposedly completed
while there was still active modification of the filesystem
taking place.

> Kernel version was kernel.org 2.6.23.9 compiled as a low latency desktop. 

The patch in 2.6.23 that introduced this check was:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=516b2e7c2661615ba5d5ad9fb584f068363502d3

Basically, the remount-readonly path was not flushing things
properly, so we changed it to flushing things properly and ensure we
got bug reports if it wasn't. Yours is the second report of not
shutting down correctly since this change went in (we've seen it
once in ~8 months in a QA environment).

I've had suspicions of a race in the remount-ro code in
do_remount_sb() w.r.t to the fs_may_remount_ro() check.  That is, we
do an unlocked check to see if we can remount readonly and then fail
to check again once we've locked the superblock out and start the
remount.

The read only flag only gets set *after* we've made the filesystem
readonly, which means before we are truly read only, we can race
with other threads opening files read/write or filesystem
modifcations can take place.

The result of that race (if it is really unsafe) will be assert you
see. The patch I wrote a couple of months ago to fix the problem
is attached below....

Cheers,

Dave.

---

Set the MS_RDONLY before we check to see if we can remount
read only so that we close a race between checking remount
is ok and setting the superblock flag that allows other
processes to start modifying the filesystem while it is
being remounted.

Signed-off-by: Dave Chinner <dgc@sgi.com>
---
 fs/xfs/linux-2.6/xfs_super.c |   16 ++++++++++++++++
 1 file changed, 16 insertions(+)

Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_super.c
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_super.c	2008-01-22 14:57:07.753782292 +1100
+++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_super.c	2008-01-23 16:22:16.940279351 +1100
@@ -1222,6 +1222,22 @@ xfs_fs_remount(
 	struct xfs_mount_args	*args = xfs_args_allocate(sb, 0);
 	int			error;
 
+	/*
+	 * We need to have the MS_RDONLY flag set on the filesystem before we
+	 * try to quiesce it down to a sane state. If we don't set the
+	 * MS_RDONLY before we check the fs_may_remount_ro(sb) state, we have a
+	 * race where write operations can start after we've checked it is OK
+	 * to remount read only. This results in assert failures due to being
+	 * unable to quiesce the transaction subsystem correctly.
+	 */
+	if (!(sb->s_flags & MS_RDONLY) && (*flags & MS_RDONLY)) {
+		sb->s_flags |= MS_RDONLY;
+		if (!fs_may_remount_ro(sb)) {
+			sb->s_flags &= ~MS_RDONLY;
+			return -EBUSY;
+		}
+	}
+
 	error = xfs_parseargs(mp, options, args, 1);
 	if (!error)
 		error = xfs_mntupdate(mp, flags, args);

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: do_remount_sb(RDONLY) race? (was: XFS oops under 2.6.23.9)
  2008-01-23  5:34 ` do_remount_sb(RDONLY) race? (was: XFS oops under 2.6.23.9) David Chinner
@ 2008-01-23  5:54   ` Jonathan Woithe
  2008-01-23  6:11     ` David Chinner
  2008-01-23 12:55   ` Christoph Hellwig
  1 sibling, 1 reply; 5+ messages in thread
From: Jonathan Woithe @ 2008-01-23  5:54 UTC (permalink / raw)
  To: David Chinner; +Cc: Jonathan Woithe, linux-kernel

Hi Dave

> On Wed, Jan 23, 2008 at 03:00:48PM +1030, Jonathan Woithe wrote:
> > Last night my laptop suffered an oops during closedown.  The full oops
> > reports can be downloaded from
> > 
> >   http://www.atrad.com.au/~jwoithe/xfs_oops/
> 
> Assertion failed: atomic_read(&mp->m_active_trans) == 0, file:
> fs/xfs/xfs_vfsops.c, line 689.
> 
> The remount read-only of the root drive supposedly completed
> while there was still active modification of the filesystem
> taking place.
> 
> > Kernel version was kernel.org 2.6.23.9 compiled as a low latency desktop. 
> 
> The patch in 2.6.23 that introduced this check was:
> 
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=516b2e7c2661615ba5d5ad9fb584f068363502d3
> 
> Basically, the remount-readonly path was not flushing things
> properly, so we changed it to flushing things properly and ensure we
> got bug reports if it wasn't. Yours is the second report of not
> shutting down correctly since this change went in (we've seen it
> once in ~8 months in a QA environment).
> 
> I've had suspicions of a race in the remount-ro code in
> do_remount_sb() w.r.t to the fs_may_remount_ro() check.  That is, we
> do an unlocked check to see if we can remount readonly and then fail
> to check again once we've locked the superblock out and start the
> remount.
> 
> The read only flag only gets set *after* we've made the filesystem
> readonly, which means before we are truly read only, we can race
> with other threads opening files read/write or filesystem
> modifcations can take place.
> 
> The result of that race (if it is really unsafe) will be assert you
> see. The patch I wrote a couple of months ago to fix the problem
> is attached below....

Thanks for the patch.  I will apply it and see what happens.

Will this be in 2.6.24?

Regards
  jonathan

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: do_remount_sb(RDONLY) race? (was: XFS oops under 2.6.23.9)
  2008-01-23  5:54   ` Jonathan Woithe
@ 2008-01-23  6:11     ` David Chinner
  0 siblings, 0 replies; 5+ messages in thread
From: David Chinner @ 2008-01-23  6:11 UTC (permalink / raw)
  To: Jonathan Woithe; +Cc: David Chinner, linux-kernel

On Wed, Jan 23, 2008 at 04:24:33PM +1030, Jonathan Woithe wrote:
> > On Wed, Jan 23, 2008 at 03:00:48PM +1030, Jonathan Woithe wrote:
> > > Last night my laptop suffered an oops during closedown.  The full oops
> > > reports can be downloaded from
> > > 
> > >   http://www.atrad.com.au/~jwoithe/xfs_oops/
> > 
> > Assertion failed: atomic_read(&mp->m_active_trans) == 0, file:
> > fs/xfs/xfs_vfsops.c, line 689.
> > 
> > The remount read-only of the root drive supposedly completed
> > while there was still active modification of the filesystem
> > taking place.
.....
> > The read only flag only gets set *after* we've made the filesystem
> > readonly, which means before we are truly read only, we can race
> > with other threads opening files read/write or filesystem
> > modifcations can take place.
> > 
> > The result of that race (if it is really unsafe) will be assert you
> > see. The patch I wrote a couple of months ago to fix the problem
> > is attached below....
> 
> Thanks for the patch.  I will apply it and see what happens.
> 
> Will this be in 2.6.24?

No - because hitting the problem is so rare that I'm not even
sure it's a problem. One of the VFS gurus will need to comment
on whether this really is a problem, and if so the correct fix
is to do_remount_sb() so that it closes the hole for everyone.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: do_remount_sb(RDONLY) race? (was: XFS oops under 2.6.23.9)
  2008-01-23  5:34 ` do_remount_sb(RDONLY) race? (was: XFS oops under 2.6.23.9) David Chinner
  2008-01-23  5:54   ` Jonathan Woithe
@ 2008-01-23 12:55   ` Christoph Hellwig
  1 sibling, 0 replies; 5+ messages in thread
From: Christoph Hellwig @ 2008-01-23 12:55 UTC (permalink / raw)
  To: David Chinner; +Cc: Jonathan Woithe, linux-kernel

On Wed, Jan 23, 2008 at 04:34:12PM +1100, David Chinner wrote:
> I've had suspicions of a race in the remount-ro code in
> do_remount_sb() w.r.t to the fs_may_remount_ro() check.  That is, we
> do an unlocked check to see if we can remount readonly and then fail
> to check again once we've locked the superblock out and start the
> remount.

This code is inherently racy, and will be replaced with a proper
per-vfsmount writer count in 2.6.25 as part of the per-mount r/o patches.


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2008-01-23 12:55 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-01-23  4:30 XFS oops under 2.6.23.9 Jonathan Woithe
2008-01-23  5:34 ` do_remount_sb(RDONLY) race? (was: XFS oops under 2.6.23.9) David Chinner
2008-01-23  5:54   ` Jonathan Woithe
2008-01-23  6:11     ` David Chinner
2008-01-23 12:55   ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox