From: Jan Kara <jack@suse.cz>
To: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Jan Kara <jack@suse.cz>,
linux-block@vger.kernel.org, Jens Axboe <axboe@kernel.dk>,
Tejun Heo <tj@kernel.org>
Subject: Re: [PATCH] bdi: Fix another oops in wb_workfn()
Date: Tue, 19 Jun 2018 10:41:41 +0200 [thread overview]
Message-ID: <20180619084141.w6azzcmm2qleyfxw@quack2.suse.cz> (raw)
In-Reply-To: <e32deee5-97d4-9cc8-3eac-84e5ad9cc520@i-love.sakura.ne.jp>
On Mon 18-06-18 23:38:12, Tetsuo Handa wrote:
> On 2018/06/18 22:46, Jan Kara wrote:
> > syzbot is reporting NULL pointer dereference at wb_workfn() [1] due to
>
> [1] https://syzkaller.appspot.com/bug?id=e0818ccb7e46190b3f1038b0c794299208ed4206
>
> line is missing.
>
> > wb->bdi->dev being NULL. And Dmitry confirmed that wb->state was
> > WB_shutting_down after wb->bdi->dev became NULL. This indicates that
> > unregister_bdi() failed to call wb_shutdown() on one of wb objects.
> >
> > The problem is in cgwb_bdi_unregister() which does cgwb_kill() and thus
> > drops bdi's reference to wb structures before going through the list of
> > wbs again and calling wb_shutdown() on each of them. This way the loop
> > iterating through all wbs can easily miss a wb if that wb has already
> > passed through cgwb_remove_from_bdi_list() called from wb_shutdown()
> > from cgwb_release_workfn() and as a result fully shutdown bdi although
> > wb_workfn() for this wb structure is still running. In fact there are
> > also other ways cgwb_bdi_unregister() can race with
> > cgwb_release_workfn() leading e.g. to use-after-free issues:
> >
> > CPU1 CPU2
> > cgwb_bdi_unregister()
> > cgwb_kill(*slot);
> >
> > cgwb_release()
> > queue_work(cgwb_release_wq, &wb->release_work);
> > cgwb_release_workfn()
> > wb = list_first_entry(&bdi->wb_list, ...)
> > spin_unlock_irq(&cgwb_lock);
> > wb_shutdown(wb);
> > ...
> > kfree_rcu(wb, rcu);
> > wb_shutdown(wb); -> oops use-after-free
> >
> > We solve these issues by synchronizing writeback structure shutdown from
> > cgwb_bdi_unregister() with cgwb_release_workfn() using a new mutex. That
> > way we also no longer need synchronization using WB_shutting_down as the
> > mutex provides it for CONFIG_CGROUP_WRITEBACK case and without
> > CONFIG_CGROUP_WRITEBACK wb_shutdown() can be called only once from
> > bdi_unregister().
>
> Wow, this patch removes WB_shutting_down.
Yes.
> A bit of worry for me is how long will this mutex_lock() sleep, for
> if there are a lot of wb objects to shutdown, sequentially doing
> wb_shutdown() might block someone's mutex_lock() for longer than
> khungtaskd's timeout period (typically 120 seconds) ?
That's a good question but since the bdi is going away in this case I
don't think the flusher work should take long to complete - the device is
removed from the system at this point so it won't do any IO.
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
next prev parent reply other threads:[~2018-06-19 8:41 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-06-18 13:46 [PATCH] bdi: Fix another oops in wb_workfn() Jan Kara
2018-06-18 14:38 ` Tetsuo Handa
2018-06-19 8:41 ` Jan Kara [this message]
2018-06-18 17:40 ` Tejun Heo
2018-06-22 8:52 ` Jan Kara
2018-06-22 18:08 ` Jens Axboe
-- strict thread matches above, loose matches on Subject: below --
2018-06-05 13:45 general protection fault in wb_workfn (2) Tetsuo Handa
2018-06-07 18:46 ` Dmitry Vyukov
2018-06-08 2:31 ` Tetsuo Handa
2018-06-08 14:45 ` Dmitry Vyukov
2018-06-08 15:16 ` Dmitry Vyukov
2018-06-08 16:53 ` Dmitry Vyukov
2018-06-08 17:14 ` Dmitry Vyukov
2018-06-09 5:30 ` Tetsuo Handa
2018-06-09 14:00 ` [PATCH] bdi: Fix another oops in wb_workfn() Tetsuo Handa
2018-06-11 9:12 ` Jan Kara
2018-06-11 16:01 ` Tejun Heo
2018-06-11 16:29 ` Jan Kara
2018-06-11 17:20 ` Tejun Heo
2018-06-12 15:57 ` Jan Kara
2018-06-13 10:43 ` Tetsuo Handa
2018-06-13 11:51 ` Tetsuo Handa
2018-06-13 14:06 ` Linus Torvalds
2018-06-13 14:46 ` Jan Kara
2018-06-13 14:55 ` Linus Torvalds
2018-06-13 16:20 ` Tetsuo Handa
2018-06-13 16:25 ` Linus Torvalds
2018-06-13 16:45 ` Jan Kara
2018-06-13 21:04 ` Tetsuo Handa
2018-06-14 10:11 ` Jan Kara
2018-06-13 14:33 ` Tejun Heo
2018-06-15 12:06 ` Jan Kara
2018-06-18 12:27 ` Jan Kara
[not found] <000000000000cbd959056d1851ca@google.com>
2018-05-27 0:47 ` general protection fault in wb_workfn (2) Tetsuo Handa
2018-05-27 2:21 ` [PATCH] bdi: Fix another oops in wb_workfn() Tetsuo Handa
2018-05-27 2:36 ` Tejun Heo
2018-05-27 4:43 ` Tetsuo Handa
2018-05-29 13:46 ` Tejun Heo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180619084141.w6azzcmm2qleyfxw@quack2.suse.cz \
--to=jack@suse.cz \
--cc=axboe@kernel.dk \
--cc=linux-block@vger.kernel.org \
--cc=penguin-kernel@i-love.sakura.ne.jp \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox