From: Al Viro <viro-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
To: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Jan Kara <jack-AlSwsSmVLrQ@public.gmane.org>,
Tahsin Erdogan <tahsin-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>,
cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Theodore Ts'o <tytso-3s7WtUTddSA@public.gmane.org>,
Nauman Rafique <nauman-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Jan Kara <jack-IBi9RG/b67k@public.gmane.org>
Subject: Re: [PATCH block/for-4.5-fixes] writeback: keep superblock pinned during cgroup writeback association switches
Date: Fri, 19 Feb 2016 21:58:11 +0000
Message-ID: <20160219215811.GA17997@ZenIV.linux.org.uk>
In-Reply-To: <20160219205147.GN13177-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org>

On Fri, Feb 19, 2016 at 03:51:47PM -0500, Tejun Heo wrote:
> I see, I suppose that's what distinguishes s_active and s_umount
> usages - whether pinning should block umounting?

???

->s_active is a plain and simple count of "I hold a long-term reference to
this superblock, don't you shut it down until I drop that".

->s_umount is held across some of the transitions in the struct super_block
life cycle, including the actual process of shutdown.
> > If you need details on s_active/s_umount/etc., I can give you a braindump,
> > but I suspect your real question is a lot more specific. Details, please...
>
> So, the problem is that cgroup writeback path sometimes schedules a
> work item to change the cgroup an inode is associated with. Currently,
> only the inode was pinned and the underlying sb may go away while the
> work item is still pending. The work item performs iput() at the end
> and that explodes if the underlying sb is already gone.
>
> As writeback path relies on s_umount for synchronization anyway, I
> think that'd be the most natural way to hold onto the sb but
> unfortunately there's no way to pass on the down_read to the async
> execution context, so I made it grab s_active, which worked fine but
> it made the sb hang around until such work items are finished. It's
> an unlikely race to hit but still broken.
>
> The last option would be canceling / flushing these work items from sb
> shutdown path which is likely more involved.
>
> What should it be doing?

Um... What ordering requirements do you have? You obviously shouldn't
let it continue past the shutdown - as a matter of fact, you *can't* let
it continue past generic_shutdown_super(), since any inode references
held at evict_inodes() time will make it very unhappy. Attempts to do
any IO after that will make things a lot worse than unhappy - data structures
needed to do it might be gone (and if you hold a bit longer, filesystem
driver itself might very well be gone, along with the functions you were
going to call).

Grabbing ->s_active is a seriously bad idea for another reason - in
a situation when there's only one mount of a given fs, plain umount() should
_not_ return 0 before fs shutdown is over. Sure, it is possible that there's
a binding somewhere, or that it's a lazy umount, etc., but those are "you've
asked for it" situations; having plain umount of e.g. ext3 on a USB stick
return success before it is safe to pull that stick out is a Bloody Bad Idea,
for obvious usability reasons.

IOW, while fs shutdown may be async, making it *always* async would be a bad
bug. And bumping ->s_active does just that.

I'd go for trylock inside that work + making generic_shutdown_super()
kill all such works. I assume that it *can* be abandoned in a situation
when we know that sync_filesystem() is about to be called and that
said sync_filesystem() won't, in turn, schedule any such works, of course...