From: Jan Kara <jack@suse.cz>
To: linux-block@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>, Tejun Heo <tj@kernel.org>,
Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
Jan Kara <jack@suse.cz>
Subject: [PATCH] bdi: Fix another oops in wb_workfn()
Date: Mon, 18 Jun 2018 15:46:58 +0200 [thread overview]
Message-ID: <20180618134658.32302-1-jack@suse.cz> (raw)
syzbot is reporting NULL pointer dereference at wb_workfn() [1] due to
wb->bdi->dev being NULL. And Dmitry confirmed that wb->state was
WB_shutting_down after wb->bdi->dev became NULL. This indicates that
unregister_bdi() failed to call wb_shutdown() on one of wb objects.
The problem is in cgwb_bdi_unregister() which does cgwb_kill() and thus
drops bdi's reference to wb structures before going through the list of
wbs again and calling wb_shutdown() on each of them. This way the loop
iterating through all wbs can easily miss a wb if that wb has already
passed through cgwb_remove_from_bdi_list() called from wb_shutdown()
from cgwb_release_workfn() and as a result fully shutdown bdi although
wb_workfn() for this wb structure is still running. In fact there are
also other ways cgwb_bdi_unregister() can race with
cgwb_release_workfn() leading e.g. to use-after-free issues:
CPU1 CPU2
cgwb_bdi_unregister()
cgwb_kill(*slot);
cgwb_release()
queue_work(cgwb_release_wq, &wb->release_work);
cgwb_release_workfn()
wb = list_first_entry(&bdi->wb_list, ...)
spin_unlock_irq(&cgwb_lock);
wb_shutdown(wb);
...
kfree_rcu(wb, rcu);
wb_shutdown(wb); -> oops use-after-free
We solve these issues by synchronizing writeback structure shutdown from
cgwb_bdi_unregister() with cgwb_release_workfn() using a new mutex. That
way we also no longer need synchronization using WB_shutting_down as the
mutex provides it for CONFIG_CGROUP_WRITEBACK case and without
CONFIG_CGROUP_WRITEBACK wb_shutdown() can be called only once from
bdi_unregister().
Reported-by: syzbot <syzbot+4a7438e774b21ddd8eca@syzkaller.appspotmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
include/linux/backing-dev-defs.h | 2 +-
mm/backing-dev.c | 20 +++++++-------------
2 files changed, 8 insertions(+), 14 deletions(-)
diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
index 0bd432a4d7bd..24251762c20c 100644
--- a/include/linux/backing-dev-defs.h
+++ b/include/linux/backing-dev-defs.h
@@ -22,7 +22,6 @@ struct dentry;
*/
enum wb_state {
WB_registered, /* bdi_register() was done */
- WB_shutting_down, /* wb_shutdown() in progress */
WB_writeback_running, /* Writeback is in progress */
WB_has_dirty_io, /* Dirty inodes on ->b_{dirty|io|more_io} */
WB_start_all, /* nr_pages == 0 (all) work pending */
@@ -189,6 +188,7 @@ struct backing_dev_info {
#ifdef CONFIG_CGROUP_WRITEBACK
struct radix_tree_root cgwb_tree; /* radix tree of active cgroup wbs */
struct rb_root cgwb_congested_tree; /* their congested states */
+ struct mutex cgwb_release_mutex; /* protect shutdown of wb structs */
#else
struct bdi_writeback_congested *wb_congested;
#endif
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 347cc834c04a..2e5d3df0853d 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -359,15 +359,8 @@ static void wb_shutdown(struct bdi_writeback *wb)
spin_lock_bh(&wb->work_lock);
if (!test_and_clear_bit(WB_registered, &wb->state)) {
spin_unlock_bh(&wb->work_lock);
- /*
- * Wait for wb shutdown to finish if someone else is just
- * running wb_shutdown(). Otherwise we could proceed to wb /
- * bdi destruction before wb_shutdown() is finished.
- */
- wait_on_bit(&wb->state, WB_shutting_down, TASK_UNINTERRUPTIBLE);
return;
}
- set_bit(WB_shutting_down, &wb->state);
spin_unlock_bh(&wb->work_lock);
cgwb_remove_from_bdi_list(wb);
@@ -379,12 +372,6 @@ static void wb_shutdown(struct bdi_writeback *wb)
mod_delayed_work(bdi_wq, &wb->dwork, 0);
flush_delayed_work(&wb->dwork);
WARN_ON(!list_empty(&wb->work_list));
- /*
- * Make sure bit gets cleared after shutdown is finished. Matches with
- * the barrier provided by test_and_clear_bit() above.
- */
- smp_wmb();
- clear_and_wake_up_bit(WB_shutting_down, &wb->state);
}
static void wb_exit(struct bdi_writeback *wb)
@@ -508,10 +495,12 @@ static void cgwb_release_workfn(struct work_struct *work)
struct bdi_writeback *wb = container_of(work, struct bdi_writeback,
release_work);
+ mutex_lock(&wb->bdi->cgwb_release_mutex);
wb_shutdown(wb);
css_put(wb->memcg_css);
css_put(wb->blkcg_css);
+ mutex_unlock(&wb->bdi->cgwb_release_mutex);
fprop_local_destroy_percpu(&wb->memcg_completions);
percpu_ref_exit(&wb->refcnt);
@@ -697,6 +686,7 @@ static int cgwb_bdi_init(struct backing_dev_info *bdi)
INIT_RADIX_TREE(&bdi->cgwb_tree, GFP_ATOMIC);
bdi->cgwb_congested_tree = RB_ROOT;
+ mutex_init(&bdi->cgwb_release_mutex);
ret = wb_init(&bdi->wb, bdi, 1, GFP_KERNEL);
if (!ret) {
@@ -717,7 +707,10 @@ static void cgwb_bdi_unregister(struct backing_dev_info *bdi)
spin_lock_irq(&cgwb_lock);
radix_tree_for_each_slot(slot, &bdi->cgwb_tree, &iter, 0)
cgwb_kill(*slot);
+ spin_unlock_irq(&cgwb_lock);
+ mutex_lock(&bdi->cgwb_release_mutex);
+ spin_lock_irq(&cgwb_lock);
while (!list_empty(&bdi->wb_list)) {
wb = list_first_entry(&bdi->wb_list, struct bdi_writeback,
bdi_node);
@@ -726,6 +719,7 @@ static void cgwb_bdi_unregister(struct backing_dev_info *bdi)
spin_lock_irq(&cgwb_lock);
}
spin_unlock_irq(&cgwb_lock);
+ mutex_unlock(&bdi->cgwb_release_mutex);
}
/**
--
2.16.4
next reply other threads:[~2018-06-18 13:46 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-06-18 13:46 Jan Kara [this message]
2018-06-18 14:38 ` [PATCH] bdi: Fix another oops in wb_workfn() Tetsuo Handa
2018-06-19 8:41 ` Jan Kara
2018-06-18 17:40 ` Tejun Heo
2018-06-22 8:52 ` Jan Kara
2018-06-22 18:08 ` Jens Axboe
-- strict thread matches above, loose matches on Subject: below --
2018-06-05 13:45 general protection fault in wb_workfn (2) Tetsuo Handa
2018-06-07 18:46 ` Dmitry Vyukov
2018-06-08 2:31 ` Tetsuo Handa
2018-06-08 14:45 ` Dmitry Vyukov
2018-06-08 15:16 ` Dmitry Vyukov
2018-06-08 16:53 ` Dmitry Vyukov
2018-06-08 17:14 ` Dmitry Vyukov
2018-06-09 5:30 ` Tetsuo Handa
2018-06-09 14:00 ` [PATCH] bdi: Fix another oops in wb_workfn() Tetsuo Handa
2018-06-11 9:12 ` Jan Kara
2018-06-11 16:01 ` Tejun Heo
2018-06-11 16:29 ` Jan Kara
2018-06-11 17:20 ` Tejun Heo
2018-06-12 15:57 ` Jan Kara
2018-06-13 10:43 ` Tetsuo Handa
2018-06-13 11:51 ` Tetsuo Handa
2018-06-13 14:06 ` Linus Torvalds
2018-06-13 14:46 ` Jan Kara
2018-06-13 14:55 ` Linus Torvalds
2018-06-13 16:20 ` Tetsuo Handa
2018-06-13 16:25 ` Linus Torvalds
2018-06-13 16:45 ` Jan Kara
2018-06-13 21:04 ` Tetsuo Handa
2018-06-14 10:11 ` Jan Kara
2018-06-13 14:33 ` Tejun Heo
2018-06-15 12:06 ` Jan Kara
2018-06-18 12:27 ` Jan Kara
[not found] <000000000000cbd959056d1851ca@google.com>
2018-05-27 0:47 ` general protection fault in wb_workfn (2) Tetsuo Handa
2018-05-27 2:21 ` [PATCH] bdi: Fix another oops in wb_workfn() Tetsuo Handa
2018-05-27 2:36 ` Tejun Heo
2018-05-27 4:43 ` Tetsuo Handa
2018-05-29 13:46 ` Tejun Heo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180618134658.32302-1-jack@suse.cz \
--to=jack@suse.cz \
--cc=axboe@kernel.dk \
--cc=linux-block@vger.kernel.org \
--cc=penguin-kernel@i-love.sakura.ne.jp \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox