From: Jan Kara <jack@suse.cz>
To: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>, Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org,
Christoph Hellwig <hch@infradead.org>,
Dan Williams <dan.j.williams@intel.com>,
Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>,
NeilBrown <neilb@suse.de>
Subject: Re: [PATCH 08/10] block: Fix oops in locked_inode_to_wb_and_lock_list()
Date: Mon, 20 Feb 2017 17:58:11 +0100
Message-ID: <20170220165811.GD2315@quack2.suse.cz>
In-Reply-To: <20170212044027.GF29323@mtj.duckdns.org>

On Sun 12-02-17 13:40:27, Tejun Heo wrote:
> Hello, Jan.
>
> On Thu, Feb 09, 2017 at 01:44:31PM +0100, Jan Kara wrote:
> > When block device is closed, we call inode_detach_wb() in __blkdev_put()
> > which sets inode->i_wb to NULL. That is contrary to expectations that
> > inode->i_wb stays valid once set during the whole inode's lifetime and
> > leads to oops in wb_get() in locked_inode_to_wb_and_lock_list() because
> > inode_to_wb() returned NULL.
> >
> > The reason why we called inode_detach_wb() is not valid anymore though.
> > BDI is guaranteed to stay around until we call bdi_put() from
> > bdev_evict_inode() so we can postpone calling inode_detach_wb() to that
> > moment. A complication is that i_wb can point to non-root wb_writeback
> > structure and in that case we do need to clean it up as bdi_unregister()
> > blocks waiting for all non-root wb_writeback references to get dropped.
> > Thus this i_wb reference could block device removal e.g. from
> > __scsi_remove_device() (which indirectly ends up calling
> > bdi_unregister()). We cannot rely on block device inode to go away soon
> > (and thus i_wb reference to get dropped) as the device may get
> > hot-removed e.g. under a mounted filesystem. We deal with these issues
> > by switching block device inode from non-root wb_writeback structure to
> > bdi->wb when needed. Since this is rather expensive (requires
> > synchronize_rcu()) we do the switching only in del_gendisk() when we
> > know the device is going away.
>
> So, the only reason cgwb_bdi_destroy() is synchronous is because bdi
> destruction was synchronous. Now that bdi is properly reference
> counted and can be decoupled from gendisk / q destruction, I can't
> think of a reason to keep cgwb destruction synchronous. Switching
> wb's on destruction is kinda clumsy and it almost always hurts to
> expose synchronize_rcu() in userland visible paths.
>
> Wouldn't something like the following work?
>
> * Remove bdi->usage_cnt and the synchronous waiting in
>   cgwb_bdi_destroy().
>
> * Instead, make cgwb's hold bdi->refcnt and put it from
>   cgwb_release_workfn().
>
> Then, we don't have to switch during shutdown and can just let things
> drain.

At first sight this looks workable and would mean less special code, so I
like it. I'll experiment with it and see how it works out.
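
To make the proposed lifetime rule concrete, here is a minimal userspace
sketch in C. It is only a toy model, not kernel code: the toy_* types and
functions are hypothetical stand-ins for the bdi and cgwb structures, plain
ints replace the kernel's atomic reference counts, and there is no RCU or
workqueue. It shows only the ordering: each cgwb pins the bdi, and the pin
is dropped from the cgwb release step, so the bdi is freed after the last
cgwb drains without anyone waiting synchronously.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for the bdi; NOT the kernel's struct backing_dev_info. */
struct toy_bdi {
	int refcnt;		/* models bdi->refcnt (non-atomic here) */
	int *freed_flag;	/* set to 1 once the bdi has been freed */
};

/* Toy stand-in for a non-root (cgroup) writeback structure. */
struct toy_cgwb {
	struct toy_bdi *bdi;	/* pinned for the lifetime of the cgwb */
	int refcnt;
};

struct toy_bdi *toy_bdi_alloc(int *freed_flag)
{
	struct toy_bdi *bdi = malloc(sizeof(*bdi));

	if (!bdi)
		return NULL;
	bdi->refcnt = 1;	/* reference held by the device owner */
	bdi->freed_flag = freed_flag;
	*freed_flag = 0;
	return bdi;
}

void toy_bdi_get(struct toy_bdi *bdi)
{
	assert(bdi->refcnt > 0);
	bdi->refcnt++;
}

void toy_bdi_put(struct toy_bdi *bdi)
{
	assert(bdi->refcnt > 0);
	if (--bdi->refcnt == 0) {
		*bdi->freed_flag = 1;
		free(bdi);
	}
}

/* Creating a cgwb takes a bdi reference instead of bumping a usage count. */
struct toy_cgwb *toy_cgwb_create(struct toy_bdi *bdi)
{
	struct toy_cgwb *wb = malloc(sizeof(*wb));

	if (!wb)
		return NULL;
	toy_bdi_get(bdi);
	wb->bdi = bdi;
	wb->refcnt = 1;
	return wb;
}

/* Plays the role of cgwb_release_workfn(): free the cgwb, then unpin. */
void toy_cgwb_release(struct toy_cgwb *wb)
{
	struct toy_bdi *bdi = wb->bdi;

	free(wb);
	toy_bdi_put(bdi);
}

void toy_cgwb_put(struct toy_cgwb *wb)
{
	assert(wb->refcnt > 0);
	if (--wb->refcnt == 0)
		toy_cgwb_release(wb);
}
```

In this model the device owner can call toy_bdi_put() at shutdown while a
cgwb is still alive; the bdi then stays allocated until toy_cgwb_put() drops
the last cgwb reference.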

								Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR