From: Hannes Reinecke <hare-l3A5Bk7waGM@public.gmane.org>
To: Anatol Pomozov <anatol.pomozov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: Cgroups <cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>
Subject: Re: Race condition between "read CFQ stats" and "block device shutdown"
Date: Wed, 04 Sep 2013 08:42:41 +0200
Message-ID: <5226D661.7070301@suse.de>
In-Reply-To: <CAOMFOmXJ5ZTYdOvdUt-oxsouhPGRmMshCRhn6AFgmFAGZw5WZA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
On 09/03/2013 10:14 PM, Anatol Pomozov wrote:
> Hi,
>
> I am running a program that checks "read CFQ stat files" for race
> conditions with other events (e.g. device shutdown).
>
> And I discovered an interesting bug. Here is the "double_unlock" crash for it
>
>
> print_unlock_imbalance_bug.isra.23+0x4/0x10
> [ 261.453775] [<ffffffff810f7c65>] lock_release_non_nested.isra.39+0x2f5/0x300
> [ 261.460900] [<ffffffff810f7cfe>] lock_release+0x8e/0x1f0
> [ 261.466293] [<ffffffff81339030>] ? cfqg_prfill_service_level+0x60/0x60
> [ 261.472894] [<ffffffff81005be3>] _raw_spin_unlock_irq+0x23/0x50
> [ 261.478894] [<ffffffff8133559f>] blkcg_print_blkgs+0x8f/0x140
> [ 261.484724] [<ffffffff81335515>] ? blkcg_print_blkgs+0x5/0x140
> [ 261.490631] [<ffffffff81338a7f>] cfqg_print_weighted_queue_time+0x2f/0x40
> [ 261.497489] [<ffffffff8110b793>] cgroup_seqfile_show+0x53/0x60
> [ 261.503398] [<ffffffff811f1fe4>] seq_read+0x124/0x3a0
> [ 261.508529] [<ffffffff811ce39d>] vfs_read+0xad/0x180
> [ 261.513576] [<ffffffff811ce625>] SyS_read+0x55/0xa0
> [ 261.518538] [<ffffffff81609f66>] cstar_dispatch+0x7/0x1f
>
> blkcg_print_blkgs fails with double unlock? Hmm, I checked
> cfqg_prfill_service_level and I did not find any places where unlock
> can happen.
>
> After some debugging I found that in blkcg_print_blkgs() the spinlock
> passed to spin_lock_irq() differs from the object passed to
> spin_unlock_irq() just a few lines below. It means the
> request_queue->queue_lock pointer was changed underneath the function
> while it was executing!
>
> To make sure I added
>
> --- a/block/blk-cgroup.c
> +++ b/block/blk-cgroup.c
> @@ -465,10 +465,16 @@ void blkcg_print_blkgs(struct seq_file *sf, struct blkcg *blkcg,
>
>  	rcu_read_lock();
>  	hlist_for_each_entry_rcu(blkg, n, &blkcg->blkg_list, blkcg_node) {
> -		spin_lock_irq(blkg->q->queue_lock);
> +		spinlock_t *lock = blkg->q->queue_lock;
> +		spinlock_t *new_lock;
> +		spin_lock_irq(lock);
>  		if (blkcg_policy_enabled(blkg->q, pol))
>  			total += prfill(sf, blkg->pd[pol->plid], data);
> -		spin_unlock_irq(blkg->q->queue_lock);
> +		new_lock = blkg->q->queue_lock;
> +		if (lock != new_lock) {
> +			pr_err("old lock %p %s new lock %p %s\n",
> +			       lock, lock->dep_map.name, new_lock, new_lock->dep_map.name);
> +		}
> +		spin_unlock_irq(lock);
>  	}
>  	rcu_read_unlock();
>
>
>
> And indeed it shows locks are different.
>
>
> It comes from this change 777eb1bf1 "block: Free queue resources at
> blk_release_queue()" that changes the lock when a device is shutting down.
>
> What would be the best fix for the issue?
>
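To restate the failure mode outside the kernel, here is a minimal
user-space sketch. Everything in it is hypothetical (fake_lock,
fake_queue, print_with_reread, print_with_cached); it only models a lock
that is reached through a pointer which shutdown code may redirect, and
it assumes the redirect happens while the reader holds the lock. It is
not real synchronization and not the kernel's code.

```c
#include <assert.h>

/* Toy lock: only counts acquisitions so an unbalanced release is visible. */
struct fake_lock {
	int held;
};

/* Hypothetical stand-in for a queue whose lock is reached via a pointer. */
struct fake_queue {
	struct fake_lock *queue_lock;   /* lock currently in use */
	struct fake_lock external_lock; /* lock used before shutdown */
	struct fake_lock internal_lock; /* lock switched to at shutdown */
};

static void fake_queue_init(struct fake_queue *q)
{
	q->external_lock.held = 0;
	q->internal_lock.held = 0;
	q->queue_lock = &q->external_lock;
}

static void fake_lock_acquire(struct fake_lock *l)
{
	l->held++;
}

/* Returns 0 on success, -1 on an unlock imbalance (lock was not held). */
static int fake_lock_release(struct fake_lock *l)
{
	if (l->held == 0)
		return -1;
	l->held--;
	return 0;
}

/* Models the lock pointer being reassigned during shutdown. */
static void fake_queue_shutdown(struct fake_queue *q)
{
	q->queue_lock = &q->internal_lock;
}

/* Re-reads q->queue_lock for the release, as the unpatched loop does. */
static int print_with_reread(struct fake_queue *q, int shutdown_midway)
{
	fake_lock_acquire(q->queue_lock);
	if (shutdown_midway)
		fake_queue_shutdown(q);
	return fake_lock_release(q->queue_lock);
}

/* Caches the pointer so the release matches the acquire. */
static int print_with_cached(struct fake_queue *q, int shutdown_midway)
{
	struct fake_lock *lock = q->queue_lock;

	fake_lock_acquire(lock);
	if (shutdown_midway)
		fake_queue_shutdown(q);
	return fake_lock_release(lock);
}
```

With a shutdown in the middle, print_with_reread() releases a lock it
never took and leaves the original one held, which is the imbalance the
trace reports; print_with_cached() stays balanced. Caching the pointer
only keeps the pair matched in this toy model; it says nothing about
whether the queue is still safe to touch.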
The correct fix would be to add checks for 'blkg->q'; the mentioned
lock reassignment can only happen during queue shutdown.
So whenever the queue is dead or stopping we should refuse to print
anything here.
Try this:
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 290792a..3e17841 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -504,6 +504,8 @@ void blkcg_print_blkgs(struct seq_file *sf, struct blkcg *blkcg,
 	rcu_read_lock();
 	hlist_for_each_entry_rcu(blkg, &blkcg->blkg_list, blkcg_node) {
+		if (unlikely(blk_queue_dying(blkg->q)))
+			continue;
 		spin_lock_irq(blkg->q->queue_lock);
 		if (blkcg_policy_enabled(blkg->q, pol))
 			total += prfill(sf, blkg->pd[pol->plid], data);
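
The intent of the check, reduced to a user-space sketch: entries whose
queue is marked dying are skipped and contribute nothing to the total.
The types and the function name (fake_queue, fake_blkg, sum_live_stats)
are made up for illustration, the dying flag is a plain bool rather than
a queue flag, and no locking is modelled at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a request queue with a "dying" marker. */
struct fake_queue {
	bool dying;
};

/* Hypothetical stand-in for a blkg: a queue pointer plus one statistic. */
struct fake_blkg {
	struct fake_queue *q;
	unsigned long stat;
};

/* Sums the statistic over all entries, skipping those on a dying queue. */
static unsigned long sum_live_stats(const struct fake_blkg *blkgs, size_t n)
{
	unsigned long total = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (blkgs[i].q->dying)
			continue; /* refuse to report anything for this queue */
		total += blkgs[i].stat;
	}
	return total;
}
```

For three entries with stats 10, 20 and 30 where only the second queue
is dying, sum_live_stats() returns 40.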
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare-l3A5Bk7waGM@public.gmane.org +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)