From: Al Viro <viro@ZenIV.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Simon Kirby <sim@hostway.ca>, Ingo Molnar <mingo@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Waiman Long <Waiman.Long@hp.com>,
Ian Applegate <ia@cloudflare.com>,
Christoph Lameter <cl@gentwo.org>,
Pekka Enberg <penberg@kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
Chris Mason <chris.mason@fusionio.com>,
Steven Whitehouse <swhiteho@redhat.com>
Subject: gfs2 deadlock (was Re: Found it)
Date: Thu, 5 Dec 2013 08:12:01 +0000
Message-ID: <20131205081201.GA17203@ZenIV.linux.org.uk>
In-Reply-To: <20131203042830.GI10323@ZenIV.linux.org.uk>
On Tue, Dec 03, 2013 at 04:28:30AM +0000, Al Viro wrote:
> These should be safe, but damnit, we really need the lifecycle documented for
> all objects - the above is only a part of it (note that for e.g. superblocks
> we have additional rules re "->s_active can't be incremented for any reason
> once it drops to zero, it can't be incremented until superblock had been
> marked 'born' and it crosses over to zero only with ->s_umount held"; there's
> 6 stages in life cycle of struct super_block and we had interesting bugs due
> to messing the transitions up). The trouble is, attempt to write those down
> tends to stray into massive grep session, with usual results - some other
> crap gets found (e.g. in some odd driver) and needs to be dealt with ;-/
> Sigh...
... and sure enough, this time is no different - gfs2 sysfs-related code
cheerfully violates lifetime rules for superblocks, which would've
caused a major mess later, if it had not immediately caused a deadlock
on the same superblock ;-/
Watch: gfs2 creates a bunch of files in sysfs (/sys/fs/gfs2/<devname>/*).
Said bunch gets removed from ->put_super(). Which is called under
->s_umount. Guess what happens if somebody tries to write "1" to
/sys/fs/gfs2/.../freeze just as we enter that ->put_super() (or at any
point starting from the moment when deactivate_locked_super() has dropped
the last active reference)? This:

static ssize_t freeze_store(struct gfs2_sbd *sdp, const char *buf, size_t len)
{
	int error;
	int n = simple_strtol(buf, NULL, 0);

	if (!capable(CAP_SYS_ADMIN))
		return -EPERM;

	switch (n) {
[snip]
	case 1:
		error = freeze_super(sdp->sd_vfs);

And freeze_super(sb) assumes that caller has an active reference to
sb:

int freeze_super(struct super_block *sb)
{
	int ret;

	atomic_inc(&sb->s_active);

... which is not legitimate when ->s_active has already reached zero.
And right after that we hit this:
	down_write(&sb->s_umount);
Voila - write(2) is waiting for ->s_umount, while umount(2) is holding
->s_umount and waits for write(2) to get past freeze_store().
Hell knows what to do here - atomic_inc_not_zero() in freeze_super()
(and failing if it fails) would've worked, but it doesn't help with
the deadlock - just write "0" instead and we hit thaw_super(), which
starts with grabbing ->s_umount. atomic_inc_not_zero()/deactivate_super()
around that call of thaw_super() would probably work, but I'll need to
look at that after I get some sleep...
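
For readers who have not met the inc-not-zero idiom, here is a minimal userspace sketch of what "atomic_inc_not_zero()/deactivate_super() around that call" could look like. It is not kernel code and not a proposed patch: struct toy_sb, toy_inc_not_zero(), toy_deactivate(), toy_thaw() and toy_guarded_store() are all hypothetical names, C11 atomics stand in for the kernel's atomic_t, and returning -EBUSY on failure is an arbitrary choice made for the illustration. The real deactivate_super() does much more than drop a counter.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for struct super_block: only the active refcount. */
struct toy_sb {
	atomic_int s_active;
};

/*
 * Userspace model of atomic_inc_not_zero(): take a reference only if
 * the count has not already dropped to zero.
 */
static bool toy_inc_not_zero(atomic_int *count)
{
	int old = atomic_load(count);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop one reference; returns true if it was the last one. */
static bool toy_deactivate(struct toy_sb *sb)
{
	return atomic_fetch_sub(&sb->s_active, 1) == 1;
}

/* Placeholder for the real work (thaw_super() in the mail above). */
static int toy_thaw(struct toy_sb *sb)
{
	(void)sb;
	return 0;
}

/*
 * Run 'op' only while holding an active reference.  If the count is
 * already zero the superblock is on its way out, so fail immediately
 * instead of touching it.
 */
static int toy_guarded_store(struct toy_sb *sb, int (*op)(struct toy_sb *))
{
	int ret;

	if (!toy_inc_not_zero(&sb->s_active))
		return -EBUSY;	/* error code chosen for illustration only */
	ret = op(sb);
	toy_deactivate(sb);
	return ret;
}
```

The point of the model is only the ordering: the store either pins the object before doing anything with it, or bails out without ever incrementing a count that has already hit zero.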
Why bother with sysfs, anyway? What's wrong with putting those same files
on gfs2meta, seeing that _this_ would have no problems with object lifetimes?
Too late by now, of course, but...