linux-kernel.vger.kernel.org archive mirror
From: Jens Axboe <axboe@kernel.dk>
To: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>,
	Nate Custer <nate@cpanel.net>,
	kvm@vger.kernel.org, linux-kernel <linux-kernel@vger.kernel.org>,
	Vivek Goyal <vgoyal@redhat.com>
Subject: Re: kvm deadlock
Date: Wed, 14 Dec 2011 17:03:54 +0100
Message-ID: <4EE8C8EA.9070207@kernel.dk>
In-Reply-To: <4EE8A7ED.7060703@redhat.com>

On 2011-12-14 14:43, Avi Kivity wrote:
> On 12/14/2011 02:25 PM, Marcelo Tosatti wrote:
>> On Mon, Dec 05, 2011 at 04:48:16PM -0600, Nate Custer wrote:
>>> Hello,
>>>
>>> I am struggling with repeatable full hardware locks when running 8-12 KVM VMs. At some point before the hard lock I get an inconsistent lock state warning. An example of this can be found here:
>>>
>>> http://pastebin.com/8wKhgE2C
>>>
>>> After that the server continues to run for a while and then starts its death spiral. When it reaches that point it fails to log anything further to the disk, but by attaching a console I have been able to get a stack trace documenting the final implosion:
>>>
>>> http://pastebin.com/PbcN76bd
>>>
>>> All of the cores end up hung and the server stops responding to all input, including SysRq commands. 
>>>
>>> I have seen this behavior on two machines (dual E5606 running Fedora 16); both passed cpuburnin testing and memtest86 scans without error.
>>>
>>> I have reproduced the crash and stack traces from a Fedora debugging kernel - 3.1.2-1 and with a vanilla 3.1.4 kernel.
>>
>> Busted hardware, apparently. Can you reproduce these issues with the
>> same workload on different hardware?
> 
> I don't think it's hardware related.  The second trace (in the first
> paste) is called during swap, so GFP_FS is set.  The first one is not,
> so GFP_FS is clear.  Lockdep is worried about the following scenario:
> 
>   acpi_early_init() is called
>   calls pcpu_alloc(), which takes pcpu_alloc_mutex
>   eventually, calls kmalloc(), or some other allocation function
>   no memory, so swap
>   call try_to_free_pages()
>   submit_bio()
>   blk_throtl_bio()
>   blkio_alloc_blkg_stats()
>   alloc_percpu()
>   pcpu_alloc(), which takes pcpu_alloc_mutex
>   deadlock
> 
> It's a little unlikely that acpi_early_init() will OOM, but lockdep
> doesn't know that.  Other callers of pcpu_alloc() could trigger the same
> thing.
> 
> When lockdep says
> 
> [ 5839.924953] other info that might help us debug this:
> [ 5839.925396]  Possible unsafe locking scenario:
> [ 5839.925397]
> [ 5839.925840]        CPU0
> [ 5839.926063]        ----
> [ 5839.926287]   lock(pcpu_alloc_mutex);
> [ 5839.926533]   <Interrupt>
> [ 5839.926756]     lock(pcpu_alloc_mutex);
> [ 5839.926986]
> 
> It really means
> 
>    <swap, set GFP_FS>
> 
> GFP_FS simply marks the beginning of a nested, unrelated context that
> uses the same thread, just like an interrupt.  Kudos to lockdep for
> catching that.
> 
> I think the allocation in blkio_alloc_blkg_stats() should be moved out
> of the I/O path into some init function. Copying Jens.

That's completely buggy: you end up with a GFP_KERNEL allocation in the
IO submit path. Vivek, per_cpu data needs to be set up at init time. You
can't allocate it dynamically from the IO path.
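
To make the problem concrete, here is a minimal userspace sketch of the
call chain Avi describes. It is not kernel code: every model_* name is
hypothetical, a pthread mutex stands in for pcpu_alloc_mutex, and
trylock is used so that re-entry is reported as -EDEADLK rather than
hanging the program.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for pcpu_alloc_mutex. */
static pthread_mutex_t model_alloc_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Nonzero once the modelled per-cpu stats have been allocated. */
static int model_stats_ready;

static int model_submit_io(void);

/*
 * Models pcpu_alloc(). If under_pressure is set, the allocation has to
 * "reclaim" memory, which submits I/O while the mutex is still held.
 * trylock replaces a blocking lock so a self-deadlock shows up as
 * -EDEADLK instead of a hang.
 */
static int model_pcpu_alloc(int under_pressure)
{
	int ret = 0;
	int rc;

	if (pthread_mutex_trylock(&model_alloc_mutex) != 0)
		return -EDEADLK;

	if (under_pressure)
		ret = model_submit_io();

	rc = pthread_mutex_unlock(&model_alloc_mutex);
	assert(rc == 0);
	return ret;
}

/* Models the I/O submit path, which allocates stats lazily if missing. */
static int model_submit_io(void)
{
	int ret;

	if (model_stats_ready)
		return 0;

	ret = model_pcpu_alloc(0);
	if (ret == 0)
		model_stats_ready = 1;
	return ret;
}

/* Models allocating the stats at init time, before any I/O is submitted. */
static int model_init_stats(void)
{
	int ret = model_pcpu_alloc(0);

	if (ret == 0)
		model_stats_ready = 1;
	return ret;
}

/*
 * Runs one scenario from a clean state: an unrelated allocation under
 * memory pressure, with or without the stats set up beforehand.
 */
static int model_run(int init_first)
{
	model_stats_ready = 0;

	if (init_first && model_init_stats() != 0)
		return -ENOMEM;

	return model_pcpu_alloc(1);
}
```

In this model, model_run(0) returns -EDEADLK because the I/O path
re-enters the allocator while the mutex is held, and model_run(1)
returns 0 because the I/O path no longer needs to allocate anything.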

-- 
Jens Axboe


