From: Jan Kara <jack@suse.cz>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [PATCH 2/2 RESEND] bdi: Avoid oops on device removal
Date: Fri, 10 Oct 2014 16:23:43 +0200 [thread overview]
Message-ID: <1412951028-4085-39-git-send-email-jack@suse.cz> (raw)
In-Reply-To: <1412951028-4085-1-git-send-email-jack@suse.cz>
After 839a8e8660b67 "writeback: replace custom worker pool
implementation with unbound workqueue" when device is removed while we
are writing to it we crash in bdi_writeback_workfn() ->
set_worker_desc() because bdi->dev is NULL. This can happen because
even though bdi_unregister() cancels all pending flushing work, nothing
really prevents new ones from being queued from balance_dirty_pages() or
other places.
Fix the problem by clearing BDI_registered bit in bdi_unregister() and
checking it before scheduling of any flushing work.
Fixes: 839a8e8660b6777e7fe4e80af1a048aebe2b5977
CC: stable at vger.kernel.org
Reviewed-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
---
fs/fs-writeback.c | 23 ++++++++++++++++++-----
include/linux/backing-dev.h | 2 +-
mm/backing-dev.c | 13 +++++++++----
3 files changed, 28 insertions(+), 10 deletions(-)
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 8277b76be983..c24e7b8b521b 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -94,16 +94,29 @@ static inline struct inode *wb_inode(struct list_head *head)
#define CREATE_TRACE_POINTS
#include <trace/events/writeback.h>
+static void bdi_wakeup_thread(struct backing_dev_info *bdi)
+{
+ spin_lock_bh(&bdi->wb_lock);
+ if (test_bit(BDI_registered, &bdi->state))
+ mod_delayed_work(bdi_wq, &bdi->wb.dwork, 0);
+ spin_unlock_bh(&bdi->wb_lock);
+}
+
static void bdi_queue_work(struct backing_dev_info *bdi,
struct wb_writeback_work *work)
{
trace_writeback_queue(bdi, work);
spin_lock_bh(&bdi->wb_lock);
+ if (!test_bit(BDI_registered, &bdi->state)) {
+ if (work->done)
+ complete(work->done);
+ goto out_unlock;
+ }
list_add_tail(&work->list, &bdi->work_list);
- spin_unlock_bh(&bdi->wb_lock);
-
mod_delayed_work(bdi_wq, &bdi->wb.dwork, 0);
+out_unlock:
+ spin_unlock_bh(&bdi->wb_lock);
}
static void
@@ -119,7 +132,7 @@ __bdi_start_writeback(struct backing_dev_info *bdi, long nr_pages,
work = kzalloc(sizeof(*work), GFP_ATOMIC);
if (!work) {
trace_writeback_nowork(bdi);
- mod_delayed_work(bdi_wq, &bdi->wb.dwork, 0);
+ bdi_wakeup_thread(bdi);
return;
}
@@ -166,7 +179,7 @@ void bdi_start_background_writeback(struct backing_dev_info *bdi)
* writeback as soon as there is no other work to do.
*/
trace_writeback_wake_background(bdi);
- mod_delayed_work(bdi_wq, &bdi->wb.dwork, 0);
+ bdi_wakeup_thread(bdi);
}
/*
@@ -1025,7 +1038,7 @@ void bdi_writeback_workfn(struct work_struct *work)
current->flags |= PF_SWAPWRITE;
if (likely(!current_is_workqueue_rescuer() ||
- list_empty(&bdi->bdi_list))) {
+ !test_bit(BDI_registered, &bdi->state))) {
/*
* The normal path. Keep writing back @bdi until its
* work_list is empty. Note that this path is also taken
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index 24819001f5c8..e488e9459a93 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -95,7 +95,7 @@ struct backing_dev_info {
unsigned int max_ratio, max_prop_frac;
struct bdi_writeback wb; /* default writeback info for this bdi */
- spinlock_t wb_lock; /* protects work_list */
+ spinlock_t wb_lock; /* protects work_list & wb.dwork scheduling */
struct list_head work_list;
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index fab8401fc54e..09d9591b7708 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -297,7 +297,10 @@ void bdi_wakeup_thread_delayed(struct backing_dev_info *bdi)
unsigned long timeout;
timeout = msecs_to_jiffies(dirty_writeback_interval * 10);
- queue_delayed_work(bdi_wq, &bdi->wb.dwork, timeout);
+ spin_lock_bh(&bdi->wb_lock);
+ if (test_bit(BDI_registered, &bdi->state))
+ queue_delayed_work(bdi_wq, &bdi->wb.dwork, timeout);
+ spin_unlock_bh(&bdi->wb_lock);
}
/*
@@ -310,9 +313,6 @@ static void bdi_remove_from_list(struct backing_dev_info *bdi)
spin_unlock_bh(&bdi_lock);
synchronize_rcu_expedited();
-
- /* bdi_list is now unused, clear it to mark @bdi dying */
- INIT_LIST_HEAD(&bdi->bdi_list);
}
int bdi_register(struct backing_dev_info *bdi, struct device *parent,
@@ -363,6 +363,11 @@ static void bdi_wb_shutdown(struct backing_dev_info *bdi)
*/
bdi_remove_from_list(bdi);
+ /* Make sure nobody queues further work */
+ spin_lock_bh(&bdi->wb_lock);
+ clear_bit(BDI_registered, &bdi->state);
+ spin_unlock_bh(&bdi->wb_lock);
+
/*
* Drain work list and shutdown the delayed_work. At this point,
* @bdi->bdi_list is empty telling bdi_Writeback_workfn() that @bdi
--
1.8.1.4
next prev parent reply other threads:[~2014-10-10 14:23 UTC|newest]
Thread overview: 46+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-10-10 14:23 [Cluster-devel] [PATCH 0/2 v2] Fix data corruption when blocksize < pagesize for mmapped data Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH 1/2 RESEND] bdi: Fix hung task on sync Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH] block: free q->flush_rq in blk_init_allocated_queue error paths Jan Kara
2014-10-10 15:19 ` Dave Jones
2014-10-10 15:32 ` Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH] block: improve rq_affinity placement Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH] block: Make rq_affinity = 1 work as expected Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH] block: strict rq_affinity Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH] ext3: Fix deadlock in data=journal mode when fs is frozen Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH] ext3: Speedup WB_SYNC_ALL pass Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH] ext4: Avoid lock inversion between i_mmap_mutex and transaction start Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH 1/2] ext4: Don't check quota format when there are no quota files Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH 1/2] ext4: Fix block zeroing when punching holes in indirect block files Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH] ext4: Fix buffer double free in ext4_alloc_branch() Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH] ext4: Fix jbd2 warning under heavy xattr load Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH] ext4: Fix zeroing of page during writeback Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH] ext4: Remove orphan list handling Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH] ext4: Speedup WB_SYNC_ALL pass Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH for 3.14-stable] fanotify: fix double free of pending permission events Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH] fs: Avoid userspace mounting anon_inodefs filesystem Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH 1/2] jbd2: Avoid pointless scanning of checkpoint lists Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH] jbd2: Optimize jbd2_log_do_checkpoint() a bit Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH] lockdep: Dump info via tracing Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH] mm: Fixup pagecache_isize_extended() definitions for !CONFIG_MMU Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH] ncpfs: fix rmdir returns Device or resource busy Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH] ocfs2: Fix quota file corruption Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH 1/2] printk: Debug patch1 Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH] printk: debug: Slow down printing to 9600 bauds Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH] printk: enable interrupts before calling console_trylock_for_printk() Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH] quota: Fix race between dqput() and dquot_scan_active() Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH] scsi: Keep interrupts disabled while submitting requests Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH] sync: don't block the flusher thread waiting on IO Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH] timer: Fix lock inversion between hrtimer_bases.lock and scheduler locks Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH] udf: Avoid infinite loop when processing indirect ICBs Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH] udf: Print error when inode is loaded Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH] vfs: Allocate anon_inode_inode in anon_inode_init() Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH 1/2] vfs: Fix data corruption when blocksize < pagesize for mmaped data Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH RESEND] vfs: Return EINVAL for default SEEK_HOLE, SEEK_DATA implementation Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH] writeback: plug writeback at a high level Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH] x86: Fixup lockdep complaint caused by io apic code Jan Kara
2014-10-10 14:23 ` Jan Kara [this message]
2014-10-10 14:23 ` [Cluster-devel] [PATCH 2/2] ext3: Don't check quota format when there are no quota files Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH 2/2] ext4: Fix hole punching for files with indirect blocks Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH 2/2] ext4: Fix mmap data corruption when blocksize < pagesize Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH 2/2] jbd2: Simplify calling convention around __jbd2_journal_clean_checkpoint_list Jan Kara
2014-10-10 14:23 ` [Cluster-devel] [PATCH 2/2] printk: Debug patch 2 Jan Kara
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1412951028-4085-39-git-send-email-jack@suse.cz \
--to=jack@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).