From: Tejun Heo <tj@kernel.org>
To: Jens Axboe <axboe@kernel.dk>,
Linus Torvalds <torvalds@linux-foundation.org>
Cc: Sitsofe Wheeler <sitsofe@yahoo.com>,
Borislav Petkov <bp@alien8.de>, Meelis Roos <mroos@linux.ee>,
Andrew Morton <akpm@linux-foundation.org>,
Kay Sievers <kay.sievers@vrfy.org>,
linux-kernel@vger.kernel.org
Subject: [PATCH RESEND 1/3 v2.6.39-rc7] block: don't use non-syncing event blocking in disk_check_events()
Date: Tue, 17 May 2011 12:27:13 +0200
Message-ID: <20110517102713.GJ20624@htj.dyndns.org>
This patch is part of the fix for the WARN_ON_ONCE() in
disk_clear_events() triggering, as reported in bug#34662.

  https://bugzilla.kernel.org/show_bug.cgi?id=34662

disk_clear_events() blocks events, schedules and flushes the event
work.  It expects the work to have started execution on schedule and
finished on return from flush.  WARN_ON_ONCE() triggers if the event
work hasn't executed as expected.  This problem happens because
__disk_block_events() fails to guarantee, in a race-free manner, that
the event work item is not in flight on return from the function.  The
problem is two-fold and this patch addresses one part of it.

When __disk_block_events() is called with @sync == %false, it bumps
the event block count, calls cancel_delayed_work() and returns.  This
makes it impossible to guarantee that event polling is not in flight
on return from a syncing __disk_block_events() - if the first blocker
was non-syncing, polling could still be in progress and later syncing
ones would assume that the first blocker already canceled it.
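
To make the ordering problem above concrete, here is a minimal
userspace sketch.  It is not kernel code: struct model_events,
model_block() and in_flight_after_two_blockers() are hypothetical
names, and the model assumes, as the description implies, that only
the first blocker issues a cancel and that a non-syncing cancel does
not wait for work which is already running.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the event-blocking state. */
struct model_events {
	int block;		/* nesting count of blockers */
	bool work_running;	/* true while the poll work is executing */
};

/*
 * Models the pre-patch blocking logic as described above: only the
 * blocker that takes the count from 0 to 1 cancels the work.
 */
static void model_block(struct model_events *ev, bool sync)
{
	bool first;

	assert(ev->block >= 0);
	first = (ev->block++ == 0);

	if (!first)
		return;		/* assumes the first blocker already cancelled */

	if (sync)
		ev->work_running = false;	/* sync cancel waits for completion */
	/* a non-syncing cancel does not wait for a running work item */
}

/*
 * Runs two nested blockers while the work is executing; the second
 * one is always syncing.  Returns whether the work is still in
 * flight after the syncing blocker has returned.
 */
static bool in_flight_after_two_blockers(bool first_is_sync)
{
	struct model_events ev = { .block = 0, .work_running = true };

	model_block(&ev, first_is_sync);
	model_block(&ev, true);
	return ev.work_running;
}
```

In this model, in_flight_after_two_blockers(false) reports the work as
still in flight even though the second blocker asked for
synchronization, which is the situation the paragraph above describes.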
Making __disk_block_events() cancel_sync regardless of the block count
isn't feasible either as it may race with forced event checking in
disk_clear_events().

As disk_check_events() is the only user of non-syncing
__disk_block_events(), updating it to directly cancel and schedule
event work is the easiest way to solve the issue.

Note that there's another bug in __disk_block_events() and this patch
doesn't fix the issue completely.  A later patch will fix the other bug.
Signed-off-by: Tejun Heo <tj@kernel.org>
Tested-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Reported-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Reported-by: Borislav Petkov <bp@alien8.de>
Reported-by: Meelis Roos <mroos@linux.ee>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
---
(sorry, forgot to cc lkml, resending)
This is the first of three patches which (finally) fix the
WARN_ON_ONCE() in disk_clear_events() triggering.  It was me being
stupid about synchronization around event blocking.

Given that we're very late in the -rc cycle and, although the fix isn't
invasive, it isn't an obvious one-liner either, and that the bug happens
sporadically with a non-critical failure mode, it might be better to
route this through block for v2.6.40-rc1 and then backport to v2.6.39
via -stable, unless v2.6.39 is gonna go through another -rc cycle.
Jens, Linus, what do you guys think?

Thank you.
block/genhd.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
Index: work/block/genhd.c
===================================================================
--- work.orig/block/genhd.c
+++ work/block/genhd.c
@@ -1508,10 +1508,18 @@ void disk_unblock_events(struct gendisk
  */
 void disk_check_events(struct gendisk *disk)
 {
-	if (disk->ev) {
-		__disk_block_events(disk, false);
-		__disk_unblock_events(disk, true);
+	struct disk_events *ev = disk->ev;
+	unsigned long flags;
+
+	if (!ev)
+		return;
+
+	spin_lock_irqsave(&ev->lock, flags);
+	if (!ev->block) {
+		cancel_delayed_work(&ev->dwork);
+		queue_delayed_work(system_nrt_wq, &ev->dwork, 0);
 	}
+	spin_unlock_irqrestore(&ev->lock, flags);
 }
 EXPORT_SYMBOL_GPL(disk_check_events);
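
For readers who want to poke at the new control flow outside the
kernel, the following is a rough userspace sketch of what the patched
disk_check_events() does.  The names struct check_model and
model_check_events() are hypothetical, the pending/delay fields stand
in for the delayed work item, and locking is left out because the
model is single-threaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the event state. */
struct check_model {
	int block;	/* > 0 while some caller has events blocked */
	bool pending;	/* a poll is queued */
	long delay;	/* delay of the queued poll */
};

/*
 * Mirrors the shape of the patched function: do nothing without
 * event state, do nothing while blocked, otherwise drop any queued
 * poll and requeue it to run immediately.  Returns true if an
 * immediate check was queued.
 */
static bool model_check_events(struct check_model *ev)
{
	if (!ev)
		return false;

	assert(ev->block >= 0);
	if (ev->block)
		return false;	/* blocked: the block count is left untouched */

	ev->pending = false;	/* stands in for cancel_delayed_work() */
	ev->pending = true;	/* stands in for queue_delayed_work(..., 0) */
	ev->delay = 0;
	return true;
}
```

Note that the model never modifies the block count, which matches the
patch: disk_check_events() no longer acts as a non-syncing blocker.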