gfs2 deadlock (was Re: Found it)

From: Al Viro <viro@ZenIV.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Simon Kirby <sim@hostway.ca>, Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Waiman Long <Waiman.Long@hp.com>,
	Ian Applegate <ia@cloudflare.com>,
	Christoph Lameter <cl@gentwo.org>,
	Pekka Enberg <penberg@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Chris Mason <chris.mason@fusionio.com>,
	Steven Whitehouse <swhiteho@redhat.com>
Subject: gfs2 deadlock (was Re: Found it)
Date: Thu, 5 Dec 2013 08:12:01 +0000
Message-ID: <20131205081201.GA17203@ZenIV.linux.org.uk>
In-Reply-To: <20131203042830.GI10323@ZenIV.linux.org.uk>

On Tue, Dec 03, 2013 at 04:28:30AM +0000, Al Viro wrote:

> These should be safe, but damnit, we really need the lifecycle documented for
> all objects - the above is only a part of it (note that for e.g. superblocks
> we have additional rules re "->s_active can't be incremented for any reason
> once it drops to zero, it can't be incremented until superblock had been
> marked 'born' and it crosses over to zero only with ->s_umount held"; there's
> 6 stages in life cycle of struct super_block and we had interesting bugs due
> to messing the transitions up).  The trouble is, attempt to write those down
> tends to stray into massive grep session, with usual results - some other
> crap gets found (e.g. in some odd driver) and needs to be dealt with ;-/
> Sigh...

... and sure enough, this time is no different - gfs2 sysfs-related code
cheerfully violates lifetime rules for superblocks, which would've
caused a major mess later, if it had not immediately caused a deadlock
on the same superblock ;-/

Watch: gfs2 creates a bunch of files in sysfs (/sys/fs/gfs2/<devname>/*).
Said bunch gets removed from ->put_super().  Which is called under
->s_umount.  Guess what happens if somebody tries to write "1" to
/sys/fs/gfs2/.../freeze just as we enter that ->put_super() (or at any
point starting from the moment when deactivate_locked_super() has dropped
the last active reference)?  This:
static ssize_t freeze_store(struct gfs2_sbd *sdp, const char *buf, size_t len)
{
        int error;
        int n = simple_strtol(buf, NULL, 0);

        if (!capable(CAP_SYS_ADMIN))
                return -EPERM;

        switch (n) {
[snip]
        case 1:
                error = freeze_super(sdp->sd_vfs);

And freeze_super(sb) assumes that the caller has an active reference to
sb:
int freeze_super(struct super_block *sb)
{
        int ret;

        atomic_inc(&sb->s_active);

... which is not legitimate when ->s_active has already reached zero.
And right after that we hit this:
        down_write(&sb->s_umount);

Voila - write(2) is waiting for ->s_umount, while umount(2) is holding
->s_umount and waiting for write(2) to get past freeze_store() (removal
of a sysfs file waits for in-flight ->store() calls to finish).
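
The cycle can be reduced to a toy user-space model.  The sketch below is
not kernel code: struct model_sb, try_down_write(), store_enter() and the
sysfs_active counter are hypothetical stand-ins, with an atomic_flag
playing the role of ->s_umount and a plain counter standing for in-flight
sysfs ->store() calls.  It only illustrates that once umount holds the
lock and a store is in flight, neither side can make progress.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the two things each side waits on; not kernel code. */
struct model_sb {
    atomic_flag s_umount;   /* set while "held for write" */
    int sysfs_active;       /* in-flight sysfs ->store() calls */
};

/* Non-blocking stand-in for down_write(): true if the lock was taken. */
static bool try_down_write(struct model_sb *sb)
{
    return !atomic_flag_test_and_set(&sb->s_umount);
}

static void up_write(struct model_sb *sb)
{
    atomic_flag_clear(&sb->s_umount);
}

/* A write(2) to the sysfs file enters and leaves the ->store() method. */
static void store_enter(struct model_sb *sb)
{
    sb->sysfs_active++;
}

static void store_exit(struct model_sb *sb)
{
    assert(sb->sysfs_active > 0);
    sb->sysfs_active--;
}

/* write(2) side: freeze_store() needs ->s_umount before it can return. */
static bool store_can_finish(struct model_sb *sb)
{
    if (!try_down_write(sb))
        return false;       /* the real code would block here */
    up_write(sb);
    return true;
}

/* umount(2) side: removing the sysfs files waits for in-flight stores. */
static bool umount_can_finish(const struct model_sb *sb)
{
    return sb->sysfs_active == 0;
}
```

With the lock taken by the "umount" side and one store entered, both
store_can_finish() and umount_can_finish() report false; release the
lock first and the store completes, after which the removal can too.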

Hell knows what to do here - atomic_inc_not_zero() in freeze_super()
(returning an error if it fails) would take care of the bogus increment,
but it doesn't help with the deadlock in general - just write "0" instead
and we hit thaw_super(), which starts with grabbing ->s_umount.
atomic_inc_not_zero()/deactivate_super() around that call of thaw_super()
would probably work, but I'll need to look at that after I get some sleep...
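
For readers unfamiliar with the pattern, here is a minimal user-space
sketch of "increment unless already zero" using C11 atomics.  It is only
a model of the counter behaviour: struct fake_sb, fake_inc_not_zero(),
fake_deactivate() and fake_freeze_store() are hypothetical names, the -1
return stands in for whatever error the real code would pick, and none of
the ->s_umount handling that deactivate_super() does in the kernel is
represented.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct super_block; only the counter matters. */
struct fake_sb {
    atomic_int s_active;
};

/* Model of atomic_inc_not_zero(): bump the counter unless it is already 0. */
static bool fake_inc_not_zero(atomic_int *v)
{
    int old = atomic_load(v);

    while (old != 0) {
        /* On failure 'old' is refreshed with the current value; retry. */
        if (atomic_compare_exchange_weak(v, &old, old + 1))
            return true;
    }
    return false;           /* already zero: the object is on its way out */
}

/* Model of dropping an active reference; true if it was the last one. */
static bool fake_deactivate(struct fake_sb *sb)
{
    int old = atomic_fetch_sub(&sb->s_active, 1);

    assert(old > 0);        /* dropping a reference we never had is a bug */
    return old == 1;
}

/* Sketch of a store handler that refuses to touch a dying superblock. */
static int fake_freeze_store(struct fake_sb *sb)
{
    if (!fake_inc_not_zero(&sb->s_active))
        return -1;          /* hypothetical error value */
    /* ... the freeze or thaw work would go here ... */
    fake_deactivate(sb);
    return 0;
}
```

While the counter is non-zero the handler takes and drops its own
reference; once the last reference is gone it bails out early instead of
resurrecting the count, so it never gets as far as waiting for a lock
that umount is holding.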

Why bother with sysfs, anyway?  What's wrong with putting those same files
on gfs2meta, seeing that _this_ would have no problems with object lifetimes?
Too late by now, of course, but...
