[PATCH block/for-linus] blkcg: don't call into policy draining if root

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH block/for-linus] blkcg: don't call into policy draining if root_blkg is already gone
@ 2014-07-05 22:43 Tejun Heo
  2014-07-12 13:30 ` Shirish Pargaonkar
  2014-07-12 15:56 ` Jens Axboe
  0 siblings, 2 replies; 3+ messages in thread
From: Tejun Heo @ 2014-07-05 22:43 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, Shirish Pargaonkar, Sasha Levin, Jet Chen

While a queue is being destroyed, all the blkgs are destroyed and its
->root_blkg pointer is set to NULL.  If someone else starts to drain
while the queue is in this state, the following oops happens.

  NULL pointer dereference at 0000000000000028
  IP: [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230
  PGD e4a1067 PUD b773067 PMD 0
  Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
  Modules linked in: cfq_iosched(-) [last unloaded: cfq_iosched]
  CPU: 1 PID: 537 Comm: bash Not tainted 3.16.0-rc3-work+ #2
  Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  task: ffff88000e222250 ti: ffff88000efd4000 task.ti: ffff88000efd4000
  RIP: 0010:[<ffffffff8144e944>]  [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230
  RSP: 0018:ffff88000efd7bf0  EFLAGS: 00010046
  RAX: 0000000000000000 RBX: ffff880015091450 RCX: 0000000000000001
  RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
  RBP: ffff88000efd7c10 R08: 0000000000000000 R09: 0000000000000001
  R10: ffff88000e222250 R11: 0000000000000000 R12: ffff880015091450
  R13: ffff880015092e00 R14: ffff880015091d70 R15: ffff88001508fc28
  FS:  00007f1332650740(0000) GS:ffff88001fa80000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 0000000000000028 CR3: 0000000009446000 CR4: 00000000000006e0
  Stack:
   ffffffff8144e8f6 ffff880015091450 0000000000000000 ffff880015091d80
   ffff88000efd7c28 ffffffff8144ae2f ffff880015091450 ffff88000efd7c58
   ffffffff81427641 ffff880015091450 ffffffff82401f00 ffff880015091450
  Call Trace:
   [<ffffffff8144ae2f>] blkcg_drain_queue+0x1f/0x60
   [<ffffffff81427641>] __blk_drain_queue+0x71/0x180
   [<ffffffff81429b3e>] blk_queue_bypass_start+0x6e/0xb0
   [<ffffffff814498b8>] blkcg_deactivate_policy+0x38/0x120
   [<ffffffff8144ec44>] blk_throtl_exit+0x34/0x50
   [<ffffffff8144aea5>] blkcg_exit_queue+0x35/0x40
   [<ffffffff8142d476>] blk_release_queue+0x26/0xd0
   [<ffffffff81454968>] kobject_cleanup+0x38/0x70
   [<ffffffff81454848>] kobject_put+0x28/0x60
   [<ffffffff81427505>] blk_put_queue+0x15/0x20
   [<ffffffff817d07bb>] scsi_device_dev_release_usercontext+0x16b/0x1c0
   [<ffffffff810bc339>] execute_in_process_context+0x89/0xa0
   [<ffffffff817d064c>] scsi_device_dev_release+0x1c/0x20
   [<ffffffff817930e2>] device_release+0x32/0xa0
   [<ffffffff81454968>] kobject_cleanup+0x38/0x70
   [<ffffffff81454848>] kobject_put+0x28/0x60
   [<ffffffff817934d7>] put_device+0x17/0x20
   [<ffffffff817d11b9>] __scsi_remove_device+0xa9/0xe0
   [<ffffffff817d121b>] scsi_remove_device+0x2b/0x40
   [<ffffffff817d1257>] sdev_store_delete+0x27/0x30
   [<ffffffff81792ca8>] dev_attr_store+0x18/0x30
   [<ffffffff8126f75e>] sysfs_kf_write+0x3e/0x50
   [<ffffffff8126ea87>] kernfs_fop_write+0xe7/0x170
   [<ffffffff811f5e9f>] vfs_write+0xaf/0x1d0
   [<ffffffff811f69bd>] SyS_write+0x4d/0xc0
   [<ffffffff81d24692>] system_call_fastpath+0x16/0x1b

776687bce42b ("block, blk-mq: draining can't be skipped even if
bypass_depth was non-zero") made it easier to trigger this bug by
making blk_queue_bypass_start() drain even when it loses the first
bypass test to blk_cleanup_queue(); however, the bug has always been
there even before the commit as blk_queue_bypass_start() could race
against queue destruction, win the initial bypass test but perform the
actual draining after blk_cleanup_queue() already destroyed all blkgs.

Fix it by skippping calling into policy draining if all the blkgs are
already gone.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Shirish Pargaonkar <spargaonkar@suse.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Reported-by: Jet Chen <jet.chen@intel.com>
Cc: stable@vger.kernel.org
---
 block/blk-cgroup.c |    7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index b9f4cc4..28d227c 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -872,6 +872,13 @@ void blkcg_drain_queue(struct request_queue *q)
 {
 	lockdep_assert_held(q->queue_lock);
 
+	/*
+	 * @q could be exiting and already have destroyed all blkgs as
+	 * indicated by NULL root_blkg.  If so, don't confuse policies.
+	 */
+	if (!q->root_blkg)
+		return;
+
 	blk_throtl_drain(q);
 }
 

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH block/for-linus] blkcg: don't call into policy draining if root_blkg is already gone
  2014-07-05 22:43 [PATCH block/for-linus] blkcg: don't call into policy draining if root_blkg is already gone Tejun Heo
@ 2014-07-12 13:30 ` Shirish Pargaonkar
  2014-07-12 15:56 ` Jens Axboe
  1 sibling, 0 replies; 3+ messages in thread
From: Shirish Pargaonkar @ 2014-07-12 13:30 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Jens Axboe, linux-kernel, Sasha Levin, Jet Chen

On Sat, 2014-07-05 at 18:43 -0400, Tejun Heo wrote:
> While a queue is being destroyed, all the blkgs are destroyed and its
> ->root_blkg pointer is set to NULL.  If someone else starts to drain
> while the queue is in this state, the following oops happens.
> 
>   NULL pointer dereference at 0000000000000028
>   IP: [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230
>   PGD e4a1067 PUD b773067 PMD 0
>   Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
>   Modules linked in: cfq_iosched(-) [last unloaded: cfq_iosched]
>   CPU: 1 PID: 537 Comm: bash Not tainted 3.16.0-rc3-work+ #2
>   Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
>   task: ffff88000e222250 ti: ffff88000efd4000 task.ti: ffff88000efd4000
>   RIP: 0010:[<ffffffff8144e944>]  [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230
>   RSP: 0018:ffff88000efd7bf0  EFLAGS: 00010046
>   RAX: 0000000000000000 RBX: ffff880015091450 RCX: 0000000000000001
>   RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
>   RBP: ffff88000efd7c10 R08: 0000000000000000 R09: 0000000000000001
>   R10: ffff88000e222250 R11: 0000000000000000 R12: ffff880015091450
>   R13: ffff880015092e00 R14: ffff880015091d70 R15: ffff88001508fc28
>   FS:  00007f1332650740(0000) GS:ffff88001fa80000(0000) knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>   CR2: 0000000000000028 CR3: 0000000009446000 CR4: 00000000000006e0
>   Stack:
>    ffffffff8144e8f6 ffff880015091450 0000000000000000 ffff880015091d80
>    ffff88000efd7c28 ffffffff8144ae2f ffff880015091450 ffff88000efd7c58
>    ffffffff81427641 ffff880015091450 ffffffff82401f00 ffff880015091450
>   Call Trace:
>    [<ffffffff8144ae2f>] blkcg_drain_queue+0x1f/0x60
>    [<ffffffff81427641>] __blk_drain_queue+0x71/0x180
>    [<ffffffff81429b3e>] blk_queue_bypass_start+0x6e/0xb0
>    [<ffffffff814498b8>] blkcg_deactivate_policy+0x38/0x120
>    [<ffffffff8144ec44>] blk_throtl_exit+0x34/0x50
>    [<ffffffff8144aea5>] blkcg_exit_queue+0x35/0x40
>    [<ffffffff8142d476>] blk_release_queue+0x26/0xd0
>    [<ffffffff81454968>] kobject_cleanup+0x38/0x70
>    [<ffffffff81454848>] kobject_put+0x28/0x60
>    [<ffffffff81427505>] blk_put_queue+0x15/0x20
>    [<ffffffff817d07bb>] scsi_device_dev_release_usercontext+0x16b/0x1c0
>    [<ffffffff810bc339>] execute_in_process_context+0x89/0xa0
>    [<ffffffff817d064c>] scsi_device_dev_release+0x1c/0x20
>    [<ffffffff817930e2>] device_release+0x32/0xa0
>    [<ffffffff81454968>] kobject_cleanup+0x38/0x70
>    [<ffffffff81454848>] kobject_put+0x28/0x60
>    [<ffffffff817934d7>] put_device+0x17/0x20
>    [<ffffffff817d11b9>] __scsi_remove_device+0xa9/0xe0
>    [<ffffffff817d121b>] scsi_remove_device+0x2b/0x40
>    [<ffffffff817d1257>] sdev_store_delete+0x27/0x30
>    [<ffffffff81792ca8>] dev_attr_store+0x18/0x30
>    [<ffffffff8126f75e>] sysfs_kf_write+0x3e/0x50
>    [<ffffffff8126ea87>] kernfs_fop_write+0xe7/0x170
>    [<ffffffff811f5e9f>] vfs_write+0xaf/0x1d0
>    [<ffffffff811f69bd>] SyS_write+0x4d/0xc0
>    [<ffffffff81d24692>] system_call_fastpath+0x16/0x1b
> 
> 776687bce42b ("block, blk-mq: draining can't be skipped even if
> bypass_depth was non-zero") made it easier to trigger this bug by
> making blk_queue_bypass_start() drain even when it loses the first
> bypass test to blk_cleanup_queue(); however, the bug has always been
> there even before the commit as blk_queue_bypass_start() could race
> against queue destruction, win the initial bypass test but perform the
> actual draining after blk_cleanup_queue() already destroyed all blkgs.
> 
> Fix it by skippping calling into policy draining if all the blkgs are
> already gone.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-by: Shirish Pargaonkar <spargaonkar@suse.com>
> Reported-by: Sasha Levin <sasha.levin@oracle.com>
> Reported-by: Jet Chen <jet.chen@intel.com>
> Cc: stable@vger.kernel.org
> ---
>  block/blk-cgroup.c |    7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
> index b9f4cc4..28d227c 100644
> --- a/block/blk-cgroup.c
> +++ b/block/blk-cgroup.c
> @@ -872,6 +872,13 @@ void blkcg_drain_queue(struct request_queue *q)
>  {
>  	lockdep_assert_held(q->queue_lock);
>  
> +	/*
> +	 * @q could be exiting and already have destroyed all blkgs as
> +	 * indicated by NULL root_blkg.  If so, don't confuse policies.
> +	 */
> +	if (!q->root_blkg)
> +		return;
> +
>  	blk_throtl_drain(q);
>  }
>  
> 

Tested-by: Shirish Pargaonkar <spargaonkar@suse.com>


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH block/for-linus] blkcg: don't call into policy draining if root_blkg is already gone
  2014-07-05 22:43 [PATCH block/for-linus] blkcg: don't call into policy draining if root_blkg is already gone Tejun Heo
  2014-07-12 13:30 ` Shirish Pargaonkar
@ 2014-07-12 15:56 ` Jens Axboe
  1 sibling, 0 replies; 3+ messages in thread
From: Jens Axboe @ 2014-07-12 15:56 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, Shirish Pargaonkar, Sasha Levin, Jet Chen

On 2014-07-06 00:43, Tejun Heo wrote:
> While a queue is being destroyed, all the blkgs are destroyed and its
> ->root_blkg pointer is set to NULL.  If someone else starts to drain
> while the queue is in this state, the following oops happens.
>
>    NULL pointer dereference at 0000000000000028
>    IP: [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230
>    PGD e4a1067 PUD b773067 PMD 0
>    Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
>    Modules linked in: cfq_iosched(-) [last unloaded: cfq_iosched]
>    CPU: 1 PID: 537 Comm: bash Not tainted 3.16.0-rc3-work+ #2
>    Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
>    task: ffff88000e222250 ti: ffff88000efd4000 task.ti: ffff88000efd4000
>    RIP: 0010:[<ffffffff8144e944>]  [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230
>    RSP: 0018:ffff88000efd7bf0  EFLAGS: 00010046
>    RAX: 0000000000000000 RBX: ffff880015091450 RCX: 0000000000000001
>    RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
>    RBP: ffff88000efd7c10 R08: 0000000000000000 R09: 0000000000000001
>    R10: ffff88000e222250 R11: 0000000000000000 R12: ffff880015091450
>    R13: ffff880015092e00 R14: ffff880015091d70 R15: ffff88001508fc28
>    FS:  00007f1332650740(0000) GS:ffff88001fa80000(0000) knlGS:0000000000000000
>    CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>    CR2: 0000000000000028 CR3: 0000000009446000 CR4: 00000000000006e0
>    Stack:
>     ffffffff8144e8f6 ffff880015091450 0000000000000000 ffff880015091d80
>     ffff88000efd7c28 ffffffff8144ae2f ffff880015091450 ffff88000efd7c58
>     ffffffff81427641 ffff880015091450 ffffffff82401f00 ffff880015091450
>    Call Trace:
>     [<ffffffff8144ae2f>] blkcg_drain_queue+0x1f/0x60
>     [<ffffffff81427641>] __blk_drain_queue+0x71/0x180
>     [<ffffffff81429b3e>] blk_queue_bypass_start+0x6e/0xb0
>     [<ffffffff814498b8>] blkcg_deactivate_policy+0x38/0x120
>     [<ffffffff8144ec44>] blk_throtl_exit+0x34/0x50
>     [<ffffffff8144aea5>] blkcg_exit_queue+0x35/0x40
>     [<ffffffff8142d476>] blk_release_queue+0x26/0xd0
>     [<ffffffff81454968>] kobject_cleanup+0x38/0x70
>     [<ffffffff81454848>] kobject_put+0x28/0x60
>     [<ffffffff81427505>] blk_put_queue+0x15/0x20
>     [<ffffffff817d07bb>] scsi_device_dev_release_usercontext+0x16b/0x1c0
>     [<ffffffff810bc339>] execute_in_process_context+0x89/0xa0
>     [<ffffffff817d064c>] scsi_device_dev_release+0x1c/0x20
>     [<ffffffff817930e2>] device_release+0x32/0xa0
>     [<ffffffff81454968>] kobject_cleanup+0x38/0x70
>     [<ffffffff81454848>] kobject_put+0x28/0x60
>     [<ffffffff817934d7>] put_device+0x17/0x20
>     [<ffffffff817d11b9>] __scsi_remove_device+0xa9/0xe0
>     [<ffffffff817d121b>] scsi_remove_device+0x2b/0x40
>     [<ffffffff817d1257>] sdev_store_delete+0x27/0x30
>     [<ffffffff81792ca8>] dev_attr_store+0x18/0x30
>     [<ffffffff8126f75e>] sysfs_kf_write+0x3e/0x50
>     [<ffffffff8126ea87>] kernfs_fop_write+0xe7/0x170
>     [<ffffffff811f5e9f>] vfs_write+0xaf/0x1d0
>     [<ffffffff811f69bd>] SyS_write+0x4d/0xc0
>     [<ffffffff81d24692>] system_call_fastpath+0x16/0x1b
>
> 776687bce42b ("block, blk-mq: draining can't be skipped even if
> bypass_depth was non-zero") made it easier to trigger this bug by
> making blk_queue_bypass_start() drain even when it loses the first
> bypass test to blk_cleanup_queue(); however, the bug has always been
> there even before the commit as blk_queue_bypass_start() could race
> against queue destruction, win the initial bypass test but perform the
> actual draining after blk_cleanup_queue() already destroyed all blkgs.
>
> Fix it by skippping calling into policy draining if all the blkgs are
> already gone.

Applied, thanks Tejun.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2014-07-12 15:56 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-07-05 22:43 [PATCH block/for-linus] blkcg: don't call into policy draining if root_blkg is already gone Tejun Heo
2014-07-12 13:30 ` Shirish Pargaonkar
2014-07-12 15:56 ` Jens Axboe

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.