From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:56237) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aXuP8-0002Jz-Di for qemu-devel@nongnu.org; Mon, 22 Feb 2016 12:36:15 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1aXuP3-0007A3-AD for qemu-devel@nongnu.org; Mon, 22 Feb 2016 12:36:14 -0500
Received: from mail-wm0-x22c.google.com ([2a00:1450:400c:c09::22c]:35835) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aXuP3-00079y-3c for qemu-devel@nongnu.org; Mon, 22 Feb 2016 12:36:09 -0500
Received: by mail-wm0-x22c.google.com with SMTP id c200so183249840wme.0 for ; Mon, 22 Feb 2016 09:36:08 -0800 (PST)
Sender: Paolo Bonzini
References: <56C8439F.5070901@profihost.ag>
From: Paolo Bonzini
Message-ID: <56CB4705.1090303@redhat.com>
Date: Mon, 22 Feb 2016 18:36:05 +0100
MIME-Version: 1.0
In-Reply-To: <56C8439F.5070901@profihost.ag>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] kernel 4.4.2: kvm_irq_delivery_to_api / rwsem_down_read_failed
To: Stefan Priebe , qemu-devel , kvm@vger.kernel.org

On 20/02/2016 11:44, Stefan Priebe wrote:
> Hi,
>
> while testing Kernel 4.4.2 and starting 20 Qemu 2.4.1 virtual machines,
> I got those traces and a load of 500 on those systems. I was only able
> to recover by sysrq-trigger.

It seems like something is happening at the VM level. A task took the mm
semaphore and hung everyone else. Difficult to debug without a core (and
without knowing who held the semaphore). Sorry.

Paolo

> All traces:
>
> INFO: task pvedaemon worke:7470 blocked for more than 120 seconds.
> Not tainted 4.4.2+1-ph #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> pvedaemon worke D ffff88239c367ca0 0 7470 7468 0x00080000
> ffff88239c367ca0 ffff8840a6232500 ffff8823ed83a500 ffff88239c367c90
> ffff88239c368000 ffff8845f5f070e8 ffff8845f5f07100 0000000000000000
> 00007ffc73b48e58 ffff88239c367cc0 ffffffffb66a4d89 ffff88239c367cf0
> Call Trace:
> [] schedule+0x39/0x80
> [] rwsem_down_read_failed+0xc7/0x120
> [] call_rwsem_down_read_failed+0x14/0x30
> [] ? down_read+0x17/0x20
> [] __access_remote_vm+0x3e/0x1c0
> [] ? call_rwsem_down_read_failed+0x14/0x30
> [] access_remote_vm+0x1f/0x30
> [] proc_pid_cmdline_read+0x16e/0x4f0
> [] ? acct_account_cputime+0x1c/0x20
> [] __vfs_read+0x18/0x40
> [] vfs_read+0x8e/0x140
> [] SyS_read+0x4f/0xa0
> [] entry_SYSCALL_64_fastpath+0x12/0x71
> INFO: task pvestatd:7633 blocked for more than 120 seconds.
> Not tainted 4.4.2+1-ph #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> pvestatd D ffff88239f16fd40 0 7633 1 0x00080000
> ffff88239f16fd40 ffff8824e76a8000 ffff8823e5fc2500 ffff8823e5fc2500
> ffff88239f170000 ffff8845f5f070e8 ffff8845f5f07100 ffff8845f5f07080
> 000000000341bf10 ffff88239f16fd60 ffffffffb66a4d89 024000d000000058
> Call Trace:
> [] schedule+0x39/0x80
> [] rwsem_down_read_failed+0xc7/0x120
> [] call_rwsem_down_read_failed+0x14/0x30
> [] ? down_read+0x17/0x20
> [] proc_pid_cmdline_read+0xac/0x4f0
> [] ? acct_account_cputime+0x1c/0x20
> [] ? account_user_time+0x73/0x80
> [] ? vtime_account_user+0x4e/0x70
> [] __vfs_read+0x18/0x40
> [] vfs_read+0x8e/0x140
> [] SyS_read+0x4f/0xa0
> [] entry_SYSCALL_64_fastpath+0x12/0x71