From: "Jun'ichi Nomura" <j-nomura@ce.jp.nec.com>
To: Vivek Goyal <vgoyal@redhat.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	device-mapper development <dm-devel@redhat.com>,
	Tejun Heo <tj@kernel.org>, Jens Axboe <axboe@kernel.dk>,
	Alasdair G Kergon <agk@redhat.com>
Subject: Re: [PATCH 2/2] dm: stay in blk_queue_bypass until queue becomes initialized
Date: Mon, 29 Oct 2012 19:15:08 +0900
Message-ID: <508E572C.6000707@ce.jp.nec.com>
In-Reply-To: <20121026202105.GF24687@redhat.com>

On 10/27/12 05:21, Vivek Goyal wrote:
> On Thu, Oct 25, 2012 at 06:41:11PM +0900, Jun'ichi Nomura wrote:
>> [PATCH] dm: stay in blk_queue_bypass until queue becomes initialized
>>
>> With 749fefe677 ("block: lift the initial queue bypass mode on
>> blk_register_queue() instead of blk_init_allocated_queue()"),
>> add_disk() eventually calls blk_queue_bypass_end().
>> This change invokes the following warning when multipath is used.
>>
>>   BUG: scheduling while atomic: multipath/2460/0x00000002
>>   1 lock held by multipath/2460:
>>    #0:  (&md->type_lock){......}, at: [<ffffffffa019fb05>] dm_lock_md_type+0x17/0x19 [dm_mod]
>>   Modules linked in: ...
>>   Pid: 2460, comm: multipath Tainted: G        W    3.7.0-rc2 #1
>>   Call Trace:
>>    [<ffffffff810723ae>] __schedule_bug+0x6a/0x78
>>    [<ffffffff81428ba2>] __schedule+0xb4/0x5e0
>>    [<ffffffff814291e6>] schedule+0x64/0x66
>>    [<ffffffff8142773a>] schedule_timeout+0x39/0xf8
>>    [<ffffffff8108ad5f>] ? put_lock_stats+0xe/0x29
>>    [<ffffffff8108ae30>] ? lock_release_holdtime+0xb6/0xbb
>>    [<ffffffff814289e3>] wait_for_common+0x9d/0xee
>>    [<ffffffff8107526c>] ? try_to_wake_up+0x206/0x206
>>    [<ffffffff810c0eb8>] ? kfree_call_rcu+0x1c/0x1c
>>    [<ffffffff81428aec>] wait_for_completion+0x1d/0x1f
>>    [<ffffffff810611f9>] wait_rcu_gp+0x5d/0x7a
>>    [<ffffffff81061216>] ? wait_rcu_gp+0x7a/0x7a
>>    [<ffffffff8106fb18>] ? complete+0x21/0x53
>>    [<ffffffff810c0556>] synchronize_rcu+0x1e/0x20
>>    [<ffffffff811dd903>] blk_queue_bypass_start+0x5d/0x62
>>    [<ffffffff811ee109>] blkcg_activate_policy+0x73/0x270
>>    [<ffffffff81130521>] ? kmem_cache_alloc_node_trace+0xc7/0x108
>>    [<ffffffff811f04b3>] cfq_init_queue+0x80/0x28e
>>    [<ffffffffa01a1600>] ? dm_blk_ioctl+0xa7/0xa7 [dm_mod]
>>    [<ffffffff811d8c41>] elevator_init+0xe1/0x115
>>    [<ffffffff811e229f>] ? blk_queue_make_request+0x54/0x59
>>    [<ffffffff811dd743>] blk_init_allocated_queue+0x8c/0x9e
>>    [<ffffffffa019ffcd>] dm_setup_md_queue+0x36/0xaa [dm_mod]
>>    [<ffffffffa01a60e6>] table_load+0x1bd/0x2c8 [dm_mod]
>>    [<ffffffffa01a7026>] ctl_ioctl+0x1d6/0x236 [dm_mod]
>>    [<ffffffffa01a5f29>] ? table_clear+0xaa/0xaa [dm_mod]
>>    [<ffffffffa01a7099>] dm_ctl_ioctl+0x13/0x17 [dm_mod]
>>    [<ffffffff811479fc>] do_vfs_ioctl+0x3fb/0x441
>>    [<ffffffff811b643c>] ? file_has_perm+0x8a/0x99
>>    [<ffffffff81147aa0>] sys_ioctl+0x5e/0x82
>>    [<ffffffff812010be>] ? trace_hardirqs_on_thunk+0x3a/0x3f
>>    [<ffffffff814310d9>] system_call_fastpath+0x16/0x1b
>>
>> The warning means during queue initialization blk_queue_bypass_start()
>> calls sleeping function (synchronize_rcu) while dm holds md->type_lock.
> 
> md->type_lock is a mutex, isn't it? I thought we are allowed to block
> and schedule out under mutex?

Hm, you are right. It's a mutex.
The warning occurs only when I turn on CONFIG_PREEMPT=y.

> add_disk() also calls disk_alloc_events() which does kzalloc(GFP_KERNEL).
> So we already have code which can block/wait under md->type_lock. I am
> not sure why should we get this warning under a mutex.

add_disk() is called without md->type_lock held.

Call flow is like this:

dm_create
  alloc_dev
    blk_alloc_queue
    alloc_disk
    add_disk
      blk_queue_bypass_end [with 3.7-rc2]

table_load
  dm_lock_md_type [takes md->type_lock]
  dm_setup_md_queue
    blk_init_allocated_queue [when DM_TYPE_REQUEST_BASED]
      elevator_init
        blkcg_activate_policy
          blk_queue_bypass_start <-- THIS triggers the warning
          blk_queue_bypass_end
      blk_queue_bypass_end [with 3.6]
  dm_unlock_md_type

In 3.6, the blk_queue_bypass_start() call in blkcg_activate_policy()
was a nested call that did nothing.
With 3.7-rc2, it becomes the initial call and does the
actual draining.
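
To illustrate the nesting described above, here is a minimal user-space
sketch. It is not the kernel code: struct model_queue and the model_*
functions are hypothetical names made up for this example, and the only
behaviour modelled is a depth counter where the outermost start does the
expensive work. The initial depth of 1 is an assumption taken from the
commit title ("lift the initial queue bypass mode").

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the request queue; only what the model needs. */
struct model_queue {
	int bypass_depth;	/* nesting count of bypass_start calls */
	int drain_count;	/* times the expensive (sleeping) path ran */
};

/* Returns true when this was the outermost call and had to drain. */
static bool model_bypass_start(struct model_queue *q)
{
	bool drain = (q->bypass_depth++ == 0);

	if (drain)
		q->drain_count++;	/* stands in for draining + synchronize_rcu() */
	return drain;
}

static void model_bypass_end(struct model_queue *q)
{
	assert(q->bypass_depth > 0);
	q->bypass_depth--;
}

/*
 * Replays the table_load path from the call flow above.
 * lifted_at_add_disk == true models 3.7-rc2, false models 3.6.
 * Returns whether the start inside blkcg_activate_policy had to drain.
 */
bool model_table_load(bool lifted_at_add_disk)
{
	/* Assumption: the queue is allocated in bypass mode (depth 1). */
	struct model_queue q = { .bypass_depth = 1, .drain_count = 0 };
	bool drained;

	if (lifted_at_add_disk)
		model_bypass_end(&q);	/* add_disk -> blk_queue_bypass_end */

	/* blkcg_activate_policy: start/end pair */
	drained = model_bypass_start(&q);
	model_bypass_end(&q);

	if (!lifted_at_add_disk)
		model_bypass_end(&q);	/* blk_init_allocated_queue -> blk_queue_bypass_end */

	return drained;
}
```

In this model, model_table_load(false) returns false (nested call, no
draining, as with 3.6) and model_table_load(true) returns true (initial
call, draining happens under md->type_lock, as with 3.7-rc2).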

-- 
Jun'ichi Nomura, NEC Corporation
