public inbox for linux-kernel@vger.kernel.org
* Inconsistent lock state with Hyper-V memory balloon?
@ 2014-11-08 14:36 Sitsofe Wheeler
  2014-11-10  9:44 ` Peter Zijlstra
  0 siblings, 1 reply; 4+ messages in thread
From: Sitsofe Wheeler @ 2014-11-08 14:36 UTC (permalink / raw)
  To: K. Y. Srinivasan
  Cc: Haiyang Zhang, devel, Peter Zijlstra, Ingo Molnar, linux-kernel

I've been trying to use the Hyper-V balloon driver to allow the host to
reclaim unused memory but have been hitting issues. With a Hyper-V 2012
R2 guest with 4GBytes of RAM, dynamic memory on, 1GByte minimum 10GByte
maximum, 8 vcpus, running a 3.18.0-rc3 kernel with no swap configured
the following lockdep splat occurred:

=================================
[ INFO: inconsistent lock state ]
3.18.0-rc3.x86_64 #159 Not tainted
---------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
swapper/0/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
 (bdev_lock){+.?...}, at: [<ffffffff811ff39c>] nr_blockdev_pages+0x1c/0x80
{SOFTIRQ-ON-W} state was registered at:
  [<ffffffff810ba03d>] __lock_acquire+0x87d/0x1c60
  [<ffffffff810bbcdc>] lock_acquire+0xfc/0x150
  [<ffffffff816e4019>] _raw_spin_lock+0x39/0x50
  [<ffffffff811ff39c>] nr_blockdev_pages+0x1c/0x80
  [<ffffffff8115d5f7>] si_meminfo+0x47/0x70
  [<ffffffff81d6622f>] eventpoll_init+0x11/0x10a
  [<ffffffff81d3d150>] do_one_initcall+0xf9/0x1a7
  [<ffffffff81d3d3d2>] kernel_init_freeable+0x1d4/0x268
  [<ffffffff816ce0ae>] kernel_init+0xe/0x100
  [<ffffffff816e4dfc>] ret_from_fork+0x7c/0xb0
irq event stamp: 2660283708
hardirqs last  enabled at (2660283708): [<ffffffff8115eef5>] free_hot_cold_page+0x175/0x190
hardirqs last disabled at (2660283707): [<ffffffff8115ee25>] free_hot_cold_page+0xa5/0x190
softirqs last  enabled at (2660132034): [<ffffffff81071e6a>] _local_bh_enable+0x4a/0x50
softirqs last disabled at (2660132035): [<ffffffff81072478>] irq_exit+0x58/0xc0

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(bdev_lock);
  <Interrupt>
    lock(bdev_lock);

 *** DEADLOCK ***

no locks held by swapper/0/0.


CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc3.x86_64 #159
Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006  05/23/2012
 ffffffff8266ac90 ffff880107403af8 ffffffff816db3ef 0000000000000000
 ffffffff81c134c0 ffff880107403b58 ffffffff816d6fd3 0000000000000001
 ffffffff00000001 ffff880100000000 ffffffff81010e6f 0000000000000046
Call Trace:
 <IRQ>  [<ffffffff816db3ef>] dump_stack+0x4e/0x68
 [<ffffffff816d6fd3>] print_usage_bug+0x1f3/0x204
 [<ffffffff81010e6f>] ? save_stack_trace+0x2f/0x50
 [<ffffffff810b6f40>] ? print_irq_inversion_bug+0x200/0x200
 [<ffffffff810b78e6>] mark_lock+0x176/0x2e0
 [<ffffffff810b9f83>] __lock_acquire+0x7c3/0x1c60
 [<ffffffff8103d548>] ? lookup_address+0x28/0x30
 [<ffffffff8103d58b>] ? _lookup_address_cpa.isra.3+0x3b/0x40
 [<ffffffff813c4e89>] ? __debug_check_no_obj_freed+0x89/0x220
 [<ffffffff810bbcdc>] lock_acquire+0xfc/0x150
 [<ffffffff811ff39c>] ? nr_blockdev_pages+0x1c/0x80
 [<ffffffff816e4019>] _raw_spin_lock+0x39/0x50
 [<ffffffff811ff39c>] ? nr_blockdev_pages+0x1c/0x80
 [<ffffffff811ff39c>] nr_blockdev_pages+0x1c/0x80
 [<ffffffff8115d5f7>] si_meminfo+0x47/0x70
 [<ffffffff815eb14d>] post_status.isra.3+0x6d/0x190
 [<ffffffff810b7f4d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff8115f00f>] ? __free_pages+0x2f/0x60
 [<ffffffff815eb34f>] ? free_balloon_pages.isra.5+0x8f/0xb0
 [<ffffffff815eb972>] balloon_onchannelcallback+0x212/0x380
 [<ffffffff815e69d3>] vmbus_on_event+0x173/0x1d0
 [<ffffffff81071b47>] tasklet_action+0x127/0x160
 [<ffffffff81071ffa>] __do_softirq+0x18a/0x340
 [<ffffffff81072478>] irq_exit+0x58/0xc0
 [<ffffffff810290c5>] hyperv_vector_handler+0x45/0x60
 [<ffffffff816e6b92>] hyperv_callback_vector+0x72/0x80
 <EOI>  [<ffffffff81037b76>] ? native_safe_halt+0x6/0x10
 [<ffffffff810b7f4d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff8100c8d1>] default_idle+0x51/0xf0
 [<ffffffff8100d30f>] arch_cpu_idle+0xf/0x20
 [<ffffffff810b01e7>] cpu_startup_entry+0x217/0x3f0
 [<ffffffff816ce099>] rest_init+0xc9/0xd0
 [<ffffffff816cdfd5>] ? rest_init+0x5/0xd0
 [<ffffffff81d3d04a>] start_kernel+0x438/0x445
 [<ffffffff81d3c94a>] ? set_init_arg+0x57/0x57
 [<ffffffff81d3c120>] ? early_idt_handlers+0x120/0x120
 [<ffffffff81d3c59f>] x86_64_start_reservations+0x2a/0x2c
 [<ffffffff81d3c6df>] x86_64_start_kernel+0x13e/0x14d

Any help deciphering the above is greatly appreciated!

-- 
Sitsofe | http://sucs.org/~sits/


* Re: Inconsistent lock state with Hyper-V memory balloon?
  2014-11-08 14:36 Inconsistent lock state with Hyper-V memory balloon? Sitsofe Wheeler
@ 2014-11-10  9:44 ` Peter Zijlstra
  2014-11-10  9:47   ` KY Srinivasan
  2014-11-12  3:05   ` Long Li
  0 siblings, 2 replies; 4+ messages in thread
From: Peter Zijlstra @ 2014-11-10  9:44 UTC (permalink / raw)
  To: Sitsofe Wheeler
  Cc: K. Y. Srinivasan, Haiyang Zhang, devel, Ingo Molnar, linux-kernel

On Sat, Nov 08, 2014 at 02:36:54PM +0000, Sitsofe Wheeler wrote:
> I've been trying to use the Hyper-V balloon driver to allow the host to
> reclaim unused memory but have been hitting issues. With a Hyper-V 2012
> R2 guest with 4GBytes of RAM, dynamic memory on, 1GByte minimum 10GByte
> maximum, 8 vcpus, running a 3.18.0-rc3 kernel with no swap configured
> the following lockdep splat occurred:
> 
> =================================
> [ INFO: inconsistent lock state ]
> 3.18.0-rc3.x86_64 #159 Not tainted
> ---------------------------------
> inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
> swapper/0/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
>  (bdev_lock){+.?...}, at: [<ffffffff811ff39c>] nr_blockdev_pages+0x1c/0x80
> {SOFTIRQ-ON-W} state was registered at:
>   [<ffffffff810ba03d>] __lock_acquire+0x87d/0x1c60
>   [<ffffffff810bbcdc>] lock_acquire+0xfc/0x150
>   [<ffffffff816e4019>] _raw_spin_lock+0x39/0x50
>   [<ffffffff811ff39c>] nr_blockdev_pages+0x1c/0x80
>   [<ffffffff8115d5f7>] si_meminfo+0x47/0x70
>   [<ffffffff81d6622f>] eventpoll_init+0x11/0x10a
>   [<ffffffff81d3d150>] do_one_initcall+0xf9/0x1a7
>   [<ffffffff81d3d3d2>] kernel_init_freeable+0x1d4/0x268
>   [<ffffffff816ce0ae>] kernel_init+0xe/0x100
>   [<ffffffff816e4dfc>] ret_from_fork+0x7c/0xb0
> irq event stamp: 2660283708
> hardirqs last  enabled at (2660283708): [<ffffffff8115eef5>] free_hot_cold_page+0x175/0x190
> hardirqs last disabled at (2660283707): [<ffffffff8115ee25>] free_hot_cold_page+0xa5/0x190
> softirqs last  enabled at (2660132034): [<ffffffff81071e6a>] _local_bh_enable+0x4a/0x50
> softirqs last disabled at (2660132035): [<ffffffff81072478>] irq_exit+0x58/0xc0
> 
> other info that might help us debug this:
>  Possible unsafe locking scenario:
> 
>        CPU0
>        ----
>   lock(bdev_lock);
>   <Interrupt>
>     lock(bdev_lock);
> 
>  *** DEADLOCK ***
> 
> no locks held by swapper/0/0.
> 
> 
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc3.x86_64 #159
> Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006  05/23/2012
>  ffffffff8266ac90 ffff880107403af8 ffffffff816db3ef 0000000000000000
>  ffffffff81c134c0 ffff880107403b58 ffffffff816d6fd3 0000000000000001
>  ffffffff00000001 ffff880100000000 ffffffff81010e6f 0000000000000046
> Call Trace:
>  <IRQ>  [<ffffffff816db3ef>] dump_stack+0x4e/0x68
>  [<ffffffff816d6fd3>] print_usage_bug+0x1f3/0x204
>  [<ffffffff81010e6f>] ? save_stack_trace+0x2f/0x50
>  [<ffffffff810b6f40>] ? print_irq_inversion_bug+0x200/0x200
>  [<ffffffff810b78e6>] mark_lock+0x176/0x2e0
>  [<ffffffff810b9f83>] __lock_acquire+0x7c3/0x1c60
>  [<ffffffff8103d548>] ? lookup_address+0x28/0x30
>  [<ffffffff8103d58b>] ? _lookup_address_cpa.isra.3+0x3b/0x40
>  [<ffffffff813c4e89>] ? __debug_check_no_obj_freed+0x89/0x220
>  [<ffffffff810bbcdc>] lock_acquire+0xfc/0x150
>  [<ffffffff811ff39c>] ? nr_blockdev_pages+0x1c/0x80
>  [<ffffffff816e4019>] _raw_spin_lock+0x39/0x50
>  [<ffffffff811ff39c>] ? nr_blockdev_pages+0x1c/0x80
>  [<ffffffff811ff39c>] nr_blockdev_pages+0x1c/0x80
>  [<ffffffff8115d5f7>] si_meminfo+0x47/0x70
>  [<ffffffff815eb14d>] post_status.isra.3+0x6d/0x190
>  [<ffffffff810b7f4d>] ? trace_hardirqs_on+0xd/0x10
>  [<ffffffff8115f00f>] ? __free_pages+0x2f/0x60
>  [<ffffffff815eb34f>] ? free_balloon_pages.isra.5+0x8f/0xb0
>  [<ffffffff815eb972>] balloon_onchannelcallback+0x212/0x380
>  [<ffffffff815e69d3>] vmbus_on_event+0x173/0x1d0
>  [<ffffffff81071b47>] tasklet_action+0x127/0x160
>  [<ffffffff81071ffa>] __do_softirq+0x18a/0x340
>  [<ffffffff81072478>] irq_exit+0x58/0xc0
>  [<ffffffff810290c5>] hyperv_vector_handler+0x45/0x60
>  [<ffffffff816e6b92>] hyperv_callback_vector+0x72/0x80
>  <EOI>  [<ffffffff81037b76>] ? native_safe_halt+0x6/0x10
>  [<ffffffff810b7f4d>] ? trace_hardirqs_on+0xd/0x10
>  [<ffffffff8100c8d1>] default_idle+0x51/0xf0
>  [<ffffffff8100d30f>] arch_cpu_idle+0xf/0x20
>  [<ffffffff810b01e7>] cpu_startup_entry+0x217/0x3f0
>  [<ffffffff816ce099>] rest_init+0xc9/0xd0
>  [<ffffffff816cdfd5>] ? rest_init+0x5/0xd0
>  [<ffffffff81d3d04a>] start_kernel+0x438/0x445
>  [<ffffffff81d3c94a>] ? set_init_arg+0x57/0x57
>  [<ffffffff81d3c120>] ? early_idt_handlers+0x120/0x120
>  [<ffffffff81d3c59f>] x86_64_start_reservations+0x2a/0x2c
>  [<ffffffff81d3c6df>] x86_64_start_kernel+0x13e/0x14d
> 
> Any help deciphering the above is greatly appreciated!

It's fairly simple: the first trace shows where bdev_lock was taken with
softirqs enabled, and the second trace shows where it's taken from
softirq context. Combine the two and you've got a recursive deadlock.

I don't know the block layer very well, but a quick glance at the code
shows that bdev_lock isn't meant to be used from softirq context;
therefore the hyperv stuff is broken.

So complain to the hyperv people.
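
To make the two states concrete, here is a minimal userspace sketch of the
kind of bookkeeping that produces this report. It is a toy model, not
lockdep's actual code: the toy_* names are hypothetical, and only the two
usage states named in the splat (SOFTIRQ-ON-W and IN-SOFTIRQ-W) are
tracked. The arguments mirror the SC and SE fields of the context string
[HC0[0]:SC1[1]:HE1:SE0].

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy usage bits, loosely modelled on the two states named in the splat. */
enum toy_usage {
	TOY_SOFTIRQ_ON = 1 << 0, /* taken outside softirq with softirqs enabled */
	TOY_IN_SOFTIRQ = 1 << 1, /* taken from softirq context */
};

/* Hypothetical per-lock record; real lockdep tracks far more than this. */
struct toy_lock_class {
	const char *name;
	unsigned int usage; /* OR of every toy_usage bit seen so far */
};

/*
 * Record one acquisition of the lock. Returns false once the lock has been
 * seen both with softirqs enabled and from softirq context, because a
 * softirq could then interrupt the holder on the same CPU and spin on a
 * lock that can never be released.
 */
static bool toy_mark_acquire(struct toy_lock_class *lc, bool in_softirq,
			     bool softirqs_enabled)
{
	const unsigned int both = TOY_SOFTIRQ_ON | TOY_IN_SOFTIRQ;

	assert(lc != NULL);

	if (in_softirq)
		lc->usage |= TOY_IN_SOFTIRQ;
	else if (softirqs_enabled)
		lc->usage |= TOY_SOFTIRQ_ON;

	return (lc->usage & both) != both;
}
```

In this model the eventpoll_init() path corresponds to a call with
in_softirq false and softirqs_enabled true, and the
balloon_onchannelcallback() path to a call with in_softirq true. The
second call is the one that returns false, which matches the order of the
two traces in the report.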


* RE: Inconsistent lock state with Hyper-V memory balloon?
  2014-11-10  9:44 ` Peter Zijlstra
@ 2014-11-10  9:47   ` KY Srinivasan
  2014-11-12  3:05   ` Long Li
  1 sibling, 0 replies; 4+ messages in thread
From: KY Srinivasan @ 2014-11-10  9:47 UTC (permalink / raw)
  To: Peter Zijlstra, Sitsofe Wheeler
  Cc: Haiyang Zhang, devel@linuxdriverproject.org, Ingo Molnar,
	linux-kernel@vger.kernel.org



Sitsofe,

I am on vacation in India till the 17th of November. I will look at this and fix this issue when I get back.

Regards,

K. Y


* RE: Inconsistent lock state with Hyper-V memory balloon?
  2014-11-10  9:44 ` Peter Zijlstra
  2014-11-10  9:47   ` KY Srinivasan
@ 2014-11-12  3:05   ` Long Li
  1 sibling, 0 replies; 4+ messages in thread
From: Long Li @ 2014-11-12  3:05 UTC (permalink / raw)
  To: Peter Zijlstra, Sitsofe Wheeler
  Cc: KY Srinivasan, Haiyang Zhang, devel@linuxdriverproject.org,
	Ingo Molnar, linux-kernel@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 5588 bytes --]

Sitsofe, can you try the patch attached to see if it helps with the problem? 

Long


[-- Attachment #2: 0001-Move-unballoon-to-work-queue.patch --]
[-- Type: application/octet-stream, Size: 2587 bytes --]

From 38e25c0f7e9e390a0eacd48e96020117268388f3 Mon Sep 17 00:00:00 2001
From: Long Li <longli@microsoft.com>
Date: Tue, 11 Nov 2014 17:10:20 +0000
Subject: [PATCH] Move unballoon to work queue

---
 drivers/hv/hv_balloon.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index 5e90c5d..d5b11d5 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -509,6 +509,9 @@ struct hv_dynmem_device {
 	 */
 	struct balloon_state balloon_wrk;
 
+	struct dm_unballoon_request *unballoon_req;
+	struct balloon_state unballoon_wrk;
+
 	/*
 	 * State to execute the "hot-add" operation.
 	 */
@@ -1157,16 +1160,16 @@ static void balloon_up(struct work_struct *dummy)
 
 }
 
-static void balloon_down(struct hv_dynmem_device *dm,
-			struct dm_unballoon_request *req)
+static void balloon_down(struct work_struct *dummy)
 {
+	struct dm_unballoon_request *req = dm_device.unballoon_req;
 	union dm_mem_page_range *range_array = req->range_array;
 	int range_count = req->range_count;
 	struct dm_unballoon_response resp;
 	int i;
 
 	for (i = 0; i < range_count; i++) {
-		free_balloon_pages(dm, &range_array[i]);
+		free_balloon_pages(&dm_device, &range_array[i]);
 		post_status(&dm_device);
 	}
 
@@ -1183,7 +1186,7 @@ static void balloon_down(struct hv_dynmem_device *dm,
 				(unsigned long)NULL,
 				VM_PKT_DATA_INBAND, 0);
 
-	dm->state = DM_INITIALIZED;
+	dm_device.state = DM_INITIALIZED;
 }
 
 static void balloon_onchannelcallback(void *context);
@@ -1311,8 +1314,8 @@ static void balloon_onchannelcallback(void *context)
 
 		case DM_UNBALLOON_REQUEST:
 			dm->state = DM_BALLOON_DOWN;
-			balloon_down(dm,
-				 (struct dm_unballoon_request *)recv_buffer);
+			dm->unballoon_req = (struct dm_unballoon_request *)recv_buffer;
+			schedule_work(&dm_device.unballoon_wrk.wrk);
 			break;
 
 		case DM_MEM_HOT_ADD_REQUEST:
@@ -1385,6 +1388,7 @@ static int balloon_probe(struct hv_device *dev,
 	init_completion(&dm_device.config_event);
 	INIT_LIST_HEAD(&dm_device.ha_region_list);
 	INIT_WORK(&dm_device.balloon_wrk.wrk, balloon_up);
+	INIT_WORK(&dm_device.unballoon_wrk.wrk, balloon_down);
 	INIT_WORK(&dm_device.ha_wrk.wrk, hot_add_req);
 	dm_device.host_specified_ha_region = false;
 
@@ -1508,6 +1512,7 @@ static int balloon_remove(struct hv_device *dev)
 		pr_warn("Ballooned pages: %d\n", dm->num_pages_ballooned);
 
 	cancel_work_sync(&dm->balloon_wrk.wrk);
+	cancel_work_sync(&dm->unballoon_wrk.wrk);
 	cancel_work_sync(&dm->ha_wrk.wrk);
 
 	vmbus_close(dev->channel);
-- 
2.1.0
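
The general shape of the change above (the channel callback only records
the request, and the heavy work runs later from a work item) can be
sketched in plain userspace C. This is a simplified illustration, not the
driver's code: the toy_* types and functions are hypothetical, and
"freeing" is reduced to adding up page counts. One assumption is labelled
in the code: the sketch copies the request into device-owned storage
instead of keeping a pointer to the receive buffer, on the assumption that
the buffer may be reused before the deferred work runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_RANGES 8

/* Hypothetical stand-in for an unballoon request: page counts per range. */
struct toy_unballoon_req {
	int range_count;
	int ranges[TOY_MAX_RANGES];
};

struct toy_device {
	struct toy_unballoon_req pending; /* private copy owned by the device */
	bool work_queued;
	int pages_freed;
};

/*
 * Callback side (softirq-like context in the real driver): record the
 * request and return. Nothing here takes locks that are unsafe in softirq.
 * Assumption: the request is copied because the caller's buffer may be
 * overwritten by the next message before the work runs.
 */
static bool toy_on_callback(struct toy_device *dev,
			    const struct toy_unballoon_req *req)
{
	assert(dev != NULL && req != NULL);

	if (dev->work_queued)
		return false; /* single-slot queue: previous request still pending */
	if (req->range_count < 0 || req->range_count > TOY_MAX_RANGES)
		return false;

	memcpy(&dev->pending, req, sizeof(*req));
	dev->work_queued = true;
	return true;
}

/*
 * Worker side (process context in the real driver): the place where calls
 * such as si_meminfo() would be safe to make.
 */
static void toy_run_work(struct toy_device *dev)
{
	assert(dev != NULL);

	if (!dev->work_queued)
		return;

	for (int i = 0; i < dev->pending.range_count; i++)
		dev->pages_freed += dev->pending.ranges[i];

	dev->work_queued = false;
}
```

A caller would invoke toy_on_callback() from the receive path and
toy_run_work() later from a worker. Whether the real driver needs the copy
depends on how recv_buffer is managed, which this sketch does not model.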

