From mboxrd@z Thu Jan 1 00:00:00 1970
From: Juergen Gross
Subject: Re: Hypervisor error messages after xl block-detach with linux 3.18-rc5
Date: Mon, 24 Nov 2014 18:27:57 +0100
Message-ID: <54736A9D.3010901@suse.com>
References: <546EFAE3.80404@suse.com>
 <20141121135747.GB2886@laptop.dumpdata.com>
 <5473008C.4080604@suse.com>
 <5473147C020000780004A3D5@suse.com>
 <54730F8F.7080905@suse.com>
 <54734A2A.9000301@suse.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <54734A2A.9000301@suse.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich
Cc: "xen-devel@lists.xensource.com"
List-Id: xen-devel@lists.xenproject.org

On 11/24/2014 04:09 PM, Juergen Gross wrote:
> On 11/24/2014 11:59 AM, Juergen Gross wrote:
>> On 11/24/2014 11:20 AM, Jan Beulich wrote:
>>>>>> On 24.11.14 at 10:55, wrote:
>>>> - Sometimes I see only NMI watchdog messages, looking into hanging cpu
>>>>   state via xen debug keys I can see the cpu(s) in question are
>>>>   spinning
>>>>   in _raw_spin_lock():
>>>>   __handle_mm_fault()->__pte_alloc()->pmd_lock()->_raw_spin_lock()
>>>>   The hanging cpus were executing some random user processes (cron,
>>>>   bash, xargs), cr2 contained user addresses.
>>>
>>> Is this perhaps what
>>> http://lists.xenproject.org/archives/html/xen-devel/2014-11/msg02135.html
>>>
>>> appears to be about?
>>
>> Hmm, I'm not sure.
>>
>> I'll try a 3.17 kernel to verify.
>
> Still seeing the issue, but less frequent. OTOH I just found in above
> thread in lkml that 3.17 is showing that issue, too. :-(
>
> I'll try to setup a pv-variant of Linus' patch and test it...

First test seems to be okay, no immediate NMI message...

Any idea why the block-attach/detach would trigger this problem so
easily? I can see the dependency on the high cpu count, but fail to do
so for the xl actions.


Juergen