From: Linus Torvalds <torvalds@linux-foundation.org>
To: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>, Sitsofe Wheeler <sitsofe@yahoo.com>,
Borislav Petkov <bp@alien8.de>, Meelis Roos <mroos@linux.ee>,
Andrew Morton <akpm@linux-foundation.org>,
Kay Sievers <kay.sievers@vrfy.org>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH RESEND 2/3 v2.6.39-rc7] block: make disk_block_events() properly wait for work cancellation
Date: Tue, 17 May 2011 07:46:25 -0700
Message-ID: <BANLkTimj2guK6kw7QSijYZyvoRY+JKgbTg@mail.gmail.com>
In-Reply-To: <20110517102853.GL20624@htj.dyndns.org>

This is pretty disgusting.

You're not using a real lock, and to compensate for that you use a
blocking bit-lock hack. And to make that hack extra ugly, you define
the bit as a bitmask, and use the ilog2() macro to turn it into a bit
pos.

Horrid. Horrid.

Is there some fundamental reason why you cannot just turn the ev->lock
into a real semaphore (allowing blocking), and then do the dwork
cancel under the semaphore - avoiding all the crazy bit-lock crud?

Or just _add_ a semaphore to the 'struct disk_events', for chrissake.
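
A rough userspace sketch of the "add a sleeping lock" idea, using
pthreads as a stand-in for kernel primitives. The block_mutex member,
the cancels counter, disk_events_init() and cancel_event_work_sync()
are assumptions made for illustration and are not part of the patch;
in the kernel the outer lock would be a mutex or semaphore and the
inner one the existing ev->lock spinlock.

```c
#include <pthread.h>

/* Simplified userspace stand-in for the kernel's struct disk_events. */
struct disk_events {
	pthread_mutex_t block_mutex;	/* hypothetical: serializes blockers, may sleep */
	pthread_mutex_t lock;		/* stands in for the ev->lock spinlock */
	unsigned int block;		/* event blocking depth */
	unsigned int cancels;		/* illustration only: cancellations issued */
};

/* Placeholder for cancel_delayed_work_sync(), which may sleep. */
static void cancel_event_work_sync(struct disk_events *ev)
{
	ev->cancels++;
}

static void disk_events_init(struct disk_events *ev)
{
	pthread_mutex_init(&ev->block_mutex, NULL);
	pthread_mutex_init(&ev->lock, NULL);
	ev->block = 0;
	ev->cancels = 0;
}

static void disk_block_events(struct disk_events *ev)
{
	int cancel;

	/* Later blockers sleep here until the first blocker is done. */
	pthread_mutex_lock(&ev->block_mutex);

	pthread_mutex_lock(&ev->lock);
	cancel = !ev->block++;
	pthread_mutex_unlock(&ev->lock);

	/* Only the first blocker cancels, still holding block_mutex. */
	if (cancel)
		cancel_event_work_sync(ev);

	pthread_mutex_unlock(&ev->block_mutex);
}
```

Because the cancellation runs with block_mutex held, no blocker can
return while it is still in progress, and no bit-wait helper is needed.
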
This is just too ugly to survive. And even if you fixed the ilog2()
(hint: just define the bit, and then use (1u<<BIT) to define the
mask), it would be too ugly.
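
The hint above could look roughly like the following. The name
DISK_EVENT_CANCELING_BIT and the two helper functions are assumptions
for illustration; the patch itself only defines the mask
DISK_EVENT_CANCELING.

```c
#include <stdbool.h>

/* Define the bit position first (hypothetical name)... */
#define DISK_EVENT_CANCELING_BIT	31
/* ...and derive the mask from it, so no ilog2() is needed anywhere. */
#define DISK_EVENT_CANCELING		(1u << DISK_EVENT_CANCELING_BIT)

/* True if the CANCELING flag is set in the block word. */
static inline bool disk_event_is_canceling(unsigned int block)
{
	return (block & DISK_EVENT_CANCELING) != 0;
}

/* Blocking depth with the flag bit masked off. */
static inline unsigned int disk_event_block_depth(unsigned int block)
{
	return block & ~DISK_EVENT_CANCELING;
}
```

Bit-oriented calls such as wait_on_bit() and wake_up_bit() would then
take DISK_EVENT_CANCELING_BIT directly.
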
Don't do these kinds of ad-hoc locks. They are WRONG.

Linus
On Tue, May 17, 2011 at 3:28 AM, Tejun Heo <tj@kernel.org> wrote:
> disk_block_events() should guarantee that the event work is not in
> flight on return and once blocked it shouldn't issue further
> cancellations.
>
> Because there was no synchronization between the first blocker doing
> cancel_delayed_work_sync() and the following blockers, the following
> blockers could finish before cancellation was complete, which broke
> both guarantees - event work could be in flight and cancellation could
> happen after return.
>
> This bug triggered WARN_ON_ONCE() in disk_clear_events() reported in
> bug#34662.
>
> https://bugzilla.kernel.org/show_bug.cgi?id=34662
>
> Fix it by introducing DISK_EVENT_CANCELING bit which is set by the
> first blocker while cancellation is in progress. Further blockers
> wait until the bit is cleared by the first blocker.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Tested-by: Sitsofe Wheeler <sitsofe@yahoo.com>
> Reported-by: Sitsofe Wheeler <sitsofe@yahoo.com>
> Reported-by: Borislav Petkov <bp@alien8.de>
> Reported-by: Meelis Roos <mroos@linux.ee>
> Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Kay Sievers <kay.sievers@vrfy.org>
> ---
> block/genhd.c | 36 +++++++++++++++++++++++++++++++++---
> 1 file changed, 33 insertions(+), 3 deletions(-)
>
> Index: work/block/genhd.c
> ===================================================================
> --- work.orig/block/genhd.c
> +++ work/block/genhd.c
> @@ -1371,7 +1371,7 @@ struct disk_events {
> struct gendisk *disk; /* the associated disk */
> spinlock_t lock;
>
> - int block; /* event blocking depth */
> + unsigned int block; /* event blocking depth */
> unsigned int pending; /* events already sent out */
> unsigned int clearing; /* events being cleared */
>
> @@ -1379,6 +1379,8 @@ struct disk_events {
> struct delayed_work dwork;
> };
>
> +#define DISK_EVENT_CANCELING 0x80000000U
> +
> static const char *disk_events_strs[] = {
> [ilog2(DISK_EVENT_MEDIA_CHANGE)] = "media_change",
> [ilog2(DISK_EVENT_EJECT_REQUEST)] = "eject_request",
> @@ -1414,6 +1416,12 @@ static unsigned long disk_events_poll_ji
> return msecs_to_jiffies(intv_msecs);
> }
>
> +static int disk_block_wait_canceling(void *word)
> +{
> + schedule();
> + return 0;
> +}
> +
> /**
> * disk_block_events - block and flush disk event checking
> * @disk: disk to block events for
> @@ -1438,12 +1446,34 @@ void disk_block_events(struct gendisk *d
> if (!ev)
> return;
>
> + /*
> + * Bump block count and set CANCELING if we're the first blocker
> + * and have to cancel the event work.
> + */
> spin_lock_irqsave(&ev->lock, flags);
> - cancel = !ev->block++;
> + if ((cancel = !ev->block++))
> + ev->block |= DISK_EVENT_CANCELING;
> spin_unlock_irqrestore(&ev->lock, flags);
>
> - if (cancel)
> + if (cancel) {
> + /*
> + * Cancel the event work, clear CANCELING and wake up
> + * waiters.
> + */
> cancel_delayed_work_sync(&disk->ev->dwork);
> +
> + spin_lock_irqsave(&ev->lock, flags);
> + ev->block &= ~DISK_EVENT_CANCELING;
> + spin_unlock_irqrestore(&ev->lock, flags);
> + wake_up_bit(&ev->block, ilog2(DISK_EVENT_CANCELING));
> + } else {
> + /*
> + * The first blocker might not have finished canceling the
> + * event work. Wait for CANCELING to clear.
> + */
> + wait_on_bit(&ev->block, ilog2(DISK_EVENT_CANCELING),
> + disk_block_wait_canceling, TASK_UNINTERRUPTIBLE);
> + }
> }
>
> static void __disk_unblock_events(struct gendisk *disk, bool check_now)
>