From: Jan Kara <jack@suse.cz>
To: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>, Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org,
Christoph Hellwig <hch@infradead.org>,
Dan Williams <dan.j.williams@intel.com>,
Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>,
NeilBrown <neilb@suse.de>
Subject: Re: [PATCH 08/10] block: Fix oops in locked_inode_to_wb_and_lock_list()
Date: Mon, 20 Feb 2017 17:58:11 +0100 [thread overview]
Message-ID: <20170220165811.GD2315@quack2.suse.cz> (raw)
In-Reply-To: <20170212044027.GF29323@mtj.duckdns.org>
On Sun 12-02-17 13:40:27, Tejun Heo wrote:
> Hello, Jan.
>
> On Thu, Feb 09, 2017 at 01:44:31PM +0100, Jan Kara wrote:
> > When block device is closed, we call inode_detach_wb() in __blkdev_put()
> > which sets inode->i_wb to NULL. That is contrary to expectations that
> > inode->i_wb stays valid once set during the whole inode's lifetime and
> > leads to oops in wb_get() in locked_inode_to_wb_and_lock_list() because
> > inode_to_wb() returned NULL.
> >
> > The reason why we called inode_detach_wb() is not valid anymore though.
> > BDI is guaranteed to stay along until we call bdi_put() from
> > bdev_evict_inode() so we can postpone calling inode_detach_wb() to that
> > moment. A complication is that i_wb can point to non-root wb_writeback
> > structure and in that case we do need to clean it up as bdi_unregister()
> > blocks waiting for all non-root wb_writeback references to get dropped.
> > Thus this i_wb reference could block device removal e.g. from
> > __scsi_remove_device() (which indirectly ends up calling
> > bdi_unregister()). We cannot rely on block device inode to go away soon
> > (and thus i_wb reference to get dropped) as the device may got
> > hot-removed e.g. under a mounted filesystem. We deal with these issues
> > by switching block device inode from non-root wb_writeback structure to
> > bdi->wb when needed. Since this is rather expensive (requires
> > synchronize_rcu()) we do the switching only in del_gendisk() when we
> > know the device is going away.
>
> So, the only reason cgwb_bdi_destroy() is synchronous is because bdi
> destruction was synchronous. Now that bdi is properly reference
> counted and can be decoupled from gendisk / q destruction, I can't
> think of a reason to keep cgwb destruction synchronous. Switching
> wb's on destruction is kinda clumsy and it almost always hurts to
> expose synchronize_rcu() in userland visible paths.
>
> Wouldn't something like the following work?
>
> * Remove bdi->usage_cnt and the synchronous waiting in
> cgwb_bdi_destroy().
>
> * Instead, make cgwb's hold bdi->refcnt and put it from
> cgwb_release_workfn().
>
> Then, we don't have to switch during shutdown and can just let things
> drain.
At first sight this looks workable and would mean less special code so I
like it. I'll experiment with it and see how it works out.
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
next prev parent reply other threads:[~2017-02-20 16:58 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-02-09 12:44 [PATCH 0/10] block: Fix block device shutdown related races Jan Kara
2017-02-09 12:44 ` [PATCH 01/10] block: Move bdev_unhash_inode() after invalidate_partition() Jan Kara
2017-02-12 3:58 ` Tejun Heo
2017-02-20 14:53 ` Jan Kara
2017-02-09 12:44 ` [PATCH 02/10] block: Unhash also block device inode for the whole device Jan Kara
2017-02-12 4:16 ` Tejun Heo
2017-02-09 12:44 ` [PATCH 03/10] block: Revalidate i_bdev reference in bd_aquire() Jan Kara
2017-02-09 15:54 ` Jan Kara
2017-02-12 4:22 ` Tejun Heo
2017-02-09 12:44 ` [PATCH 04/10] block: Move bdi_unregister() to del_gendisk() Jan Kara
2017-02-10 2:21 ` NeilBrown
2017-02-12 4:31 ` Tejun Heo
2017-02-09 12:44 ` [PATCH 05/10] writeback: Generalize and standardize I_SYNC waiting function Jan Kara
2017-02-12 4:32 ` Tejun Heo
2017-02-09 12:44 ` [PATCH 06/10] writeback: Move __inode_wait_for_state_bit Jan Kara
2017-02-09 12:44 ` [PATCH 07/10] writeback: Implement reliable switching to default writeback structure Jan Kara
2017-02-10 2:19 ` NeilBrown
2017-02-10 13:20 ` Jan Kara
2017-02-09 12:44 ` [PATCH 08/10] block: Fix oops in locked_inode_to_wb_and_lock_list() Jan Kara
2017-02-12 4:40 ` Tejun Heo
2017-02-20 16:58 ` Jan Kara [this message]
2017-02-09 12:44 ` [PATCH 09/10] kobject: Export kobject_get_unless_zero() Jan Kara
2017-02-12 4:41 ` Tejun Heo
2017-02-09 12:44 ` [PATCH 10/10] block: Fix oops scsi_disk_get() Jan Kara
2017-02-12 4:43 ` Tejun Heo
2017-02-09 14:52 ` [PATCH 0/10] block: Fix block device shutdown related races Thiago Jung Bauermann
2017-02-09 15:48 ` Jan Kara
2017-02-13 14:27 ` Thiago Jung Bauermann
-- strict thread matches above, loose matches on Subject: below --
2017-03-23 0:36 [PATCH 0/10 v5] " Jan Kara
2017-03-23 0:37 ` [PATCH 08/10] block: Fix oops in locked_inode_to_wb_and_lock_list() Jan Kara
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170220165811.GD2315@quack2.suse.cz \
--to=jack@suse.cz \
--cc=axboe@kernel.dk \
--cc=bauerman@linux.vnet.ibm.com \
--cc=dan.j.williams@intel.com \
--cc=hch@infradead.org \
--cc=linux-block@vger.kernel.org \
--cc=neilb@suse.de \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox