All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Al Viro <viro@zeniv.linux.org.uk>
Subject: [ 19/23] livelock avoidance in sget()
Date: Fri,  2 Aug 2013 18:23:32 +0800	[thread overview]
Message-ID: <20130802102038.884057521@linuxfoundation.org> (raw)
In-Reply-To: <20130802102036.180660415@linuxfoundation.org>

3.0-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Al Viro <viro@zeniv.linux.org.uk>

commit acfec9a5a892f98461f52ed5770de99a3e571ae2 upstream.

Eric Sandeen has found a nasty livelock in sget() - take a mount(2) about
to fail.  The superblock is on ->fs_supers, ->s_umount is held exclusive,
->s_active is 1.  Along comes two more processes, trying to mount the same
thing; sget() in each is picking that superblock, bumping ->s_count and
trying to grab ->s_umount.  ->s_active is 3 now.  Original mount(2)
finally gets to deactivate_locked_super() on failure; ->s_active is 2,
superblock is still ->fs_supers because shutdown will *not* happen until
->s_active hits 0.  ->s_umount is dropped and now we have two processes
chasing each other:
s_active = 2, A acquired ->s_umount, B blocked
A sees that the damn thing is stillborn, does deactivate_locked_super()
s_active = 1, A drops ->s_umount, B gets it
A restarts the search and finds the same superblock.  And bumps it ->s_active.
s_active = 2, B holds ->s_umount, A blocked on trying to get it
... and we are in the earlier situation with A and B switched places.

The root cause, of course, is that ->s_active should not grow until we'd
got MS_BORN.  Then failing ->mount() will have deactivate_locked_super()
shut the damn thing down.  Fortunately, it's easy to do - the key point
is that grab_super() is called only for superblocks currently on ->fs_supers,
so it can bump ->s_count and grab ->s_umount first, then check MS_BORN and
bump ->s_active; we must never increment ->s_count for superblocks past
->kill_sb(), but grab_super() is never called for those.

The bug is pretty old; we would've caught it by now, if not for accidental
exclusion between sget() for block filesystems; the things like cgroup or
e.g. mtd-based filesystems don't have anything of that sort, so they get
bitten.  The right way to deal with that is obviously to fix sget()...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/super.c |   25 ++++++++++---------------
 1 file changed, 10 insertions(+), 15 deletions(-)

--- a/fs/super.c
+++ b/fs/super.c
@@ -222,19 +222,19 @@ EXPORT_SYMBOL(deactivate_super);
  *	and want to turn it into a full-blown active reference.  grab_super()
  *	is called with sb_lock held and drops it.  Returns 1 in case of
  *	success, 0 if we had failed (superblock contents was already dead or
- *	dying when grab_super() had been called).
+ *	dying when grab_super() had been called).  Note that this is only
+ *	called for superblocks not in rundown mode (== ones still on ->fs_supers
+ *	of their type), so increment of ->s_count is OK here.
  */
 static int grab_super(struct super_block *s) __releases(sb_lock)
 {
-	if (atomic_inc_not_zero(&s->s_active)) {
-		spin_unlock(&sb_lock);
-		return 1;
-	}
-	/* it's going away */
 	s->s_count++;
 	spin_unlock(&sb_lock);
-	/* wait for it to die */
 	down_write(&s->s_umount);
+	if ((s->s_flags & MS_BORN) && atomic_inc_not_zero(&s->s_active)) {
+		put_super(s);
+		return 1;
+	}
 	up_write(&s->s_umount);
 	put_super(s);
 	return 0;
@@ -335,11 +335,6 @@ retry:
 				destroy_super(s);
 				s = NULL;
 			}
-			down_write(&old->s_umount);
-			if (unlikely(!(old->s_flags & MS_BORN))) {
-				deactivate_locked_super(old);
-				goto retry;
-			}
 			return old;
 		}
 	}
@@ -512,10 +507,10 @@ restart:
 		if (list_empty(&sb->s_instances))
 			continue;
 		if (sb->s_bdev == bdev) {
-			if (grab_super(sb)) /* drops sb_lock */
-				return sb;
-			else
+			if (!grab_super(sb))
 				goto restart;
+			up_write(&sb->s_umount);
+			return sb;
 		}
 	}
 	spin_unlock(&sb_lock);



  parent reply	other threads:[~2013-08-02 10:27 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-08-02 10:23 [ 00/23] 3.0.89-stable review Greg Kroah-Hartman
2013-08-02 10:23 ` [ 01/23] USB: storage: Add MicroVault Flash Drive to unusual_devs Greg Kroah-Hartman
2013-08-02 10:23 ` [ 02/23] ASoC: max98088 - fix element type of the register cache Greg Kroah-Hartman
2013-08-02 10:23 ` [ 03/23] SCSI: sd: fix crash when UA received on DIF enabled device Greg Kroah-Hartman
2013-08-02 10:23 ` [ 04/23] SCSI: qla2xxx: Properly set the tagging for commands Greg Kroah-Hartman
2013-08-14 16:40   ` Jack Hill
2013-08-14 17:04     ` Greg Kroah-Hartman
2013-08-14 17:31       ` Jack Hill
2013-08-02 10:23 ` [ 05/23] tracing: Fix irqs-off tag display in syscall tracing Greg Kroah-Hartman
2013-08-02 10:23 ` [ 06/23] xhci: fix null pointer dereference on ring_doorbell_for_active_rings Greg Kroah-Hartman
2013-08-02 10:23 ` [ 07/23] xhci: Avoid NULL pointer deref when host dies Greg Kroah-Hartman
2013-08-02 10:23 ` [ 08/23] USB: ti_usb_3410_5052: fix dynamic-id matching Greg Kroah-Hartman
2013-08-02 10:23 ` [ 09/23] USB: misc: Add Manhattan Hi-Speed USB DVI Converter to sisusbvga Greg Kroah-Hartman
2013-08-02 10:23 ` [ 10/23] usb: Clear both buffers when clearing a control transfer TT buffer Greg Kroah-Hartman
2013-08-02 10:23 ` [ 11/23] staging: comedi: COMEDI_CANCEL ioctl should wake up read/write Greg Kroah-Hartman
2013-08-02 10:23 ` [ 12/23] libata: make it clear that sata_inic162x is experimental Greg Kroah-Hartman
2013-08-02 10:23 ` [ 13/23] powerpc/modules: Module CRC relocation fix causes perf issues Greg Kroah-Hartman
2013-08-02 10:23 ` [ 14/23] ACPI / memhotplug: Fix a stale pointer in error path Greg Kroah-Hartman
2013-08-02 10:23 ` [ 15/23] drm/radeon: fix combios tables on older cards Greg Kroah-Hartman
2013-08-02 10:23 ` [ 16/23] drm/radeon: improve dac adjust heuristics for legacy pdac Greg Kroah-Hartman
2013-08-02 10:23 ` [ 17/23] drm/radeon/atom: initialize more atom interpretor elements to 0 Greg Kroah-Hartman
2013-08-02 10:23 ` [ 18/23] USB: serial: ftdi_sio: add more RT Systems ftdi devices Greg Kroah-Hartman
2013-08-02 10:23 ` Greg Kroah-Hartman [this message]
2013-08-02 10:23 ` [ 20/23] xen/evtchn: avoid a deadlock when unbinding an event channel Greg Kroah-Hartman
2013-08-02 10:23 ` [ 21/23] virtio: support unlocked queue poll Greg Kroah-Hartman
2013-08-02 10:23 ` [ 22/23] virtio_net: fix race in RX VQ processing Greg Kroah-Hartman
2013-08-02 10:23 ` [ 23/23] mm/memory-hotplug: fix lowmem count overflow when offline pages Greg Kroah-Hartman
2013-08-02 19:59 ` [ 00/23] 3.0.89-stable review Shuah Khan
2013-08-02 21:28 ` Guenter Roeck
2013-08-02 22:36   ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130802102038.884057521@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.