From: Alex Bennée
Date: Thu, 30 Jun 2016 11:32:12 +0100
Subject: Re: [Qemu-devel] [RFC 6/8] linux-user: Support CPU work queue
To: Sergey Fedorov
Cc: Sergey Fedorov, qemu-devel@nongnu.org, Riku Voipio, Peter Crosthwaite, patches@linaro.org, Paolo Bonzini, Richard Henderson
Message-ID: <87furvq85v.fsf@linaro.org>
In-reply-to: <5774E8C2.1050506@gmail.com>
References: <1466375313-7562-1-git-send-email-sergey.fedorov@linaro.org> <1466375313-7562-7-git-send-email-sergey.fedorov@linaro.org> <87lh1o0y1k.fsf@linaro.org> <5774E8C2.1050506@gmail.com>

Sergey Fedorov writes:

> On 29/06/16 19:17, Alex Bennée wrote:
>> So I think there is a deadlock we can get with the async work:
>>
>> (gdb) thread apply all bt
>>
>> Thread 11 (Thread 0x7ffefeca7700 (LWP 2912)):
>> #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
>> #1  0x00005555555cb777 in wait_cpu_work () at /home/alex/lsrc/qemu/qemu.git/linux-user/main.c:155
>> #2  0x00005555555a0cee in wait_safe_cpu_work () at /home/alex/lsrc/qemu/qemu.git/cpu-exec-common.c:87
>> #3  0x00005555555cb8fe in cpu_exec_end (cpu=0x555555bb67e0) at /home/alex/lsrc/qemu/qemu.git/linux-user/main.c:222
>> #4  0x00005555555cc7a7 in cpu_loop (env=0x555555bbea58) at /home/alex/lsrc/qemu/qemu.git/linux-user/main.c:749
>> #5  0x00005555555db0b2 in clone_func (arg=0x7fffffffc9c0) at /home/alex/lsrc/qemu/qemu.git/linux-user/syscall.c:5424
>> #6  0x00007ffff6bed6fa in start_thread (arg=0x7ffefeca7700) at pthread_create.c:333
>> #7  0x00007ffff6923b5d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
>>
>>
>>
>> Thread 3 (Thread 0x7ffff7f38700 (LWP 2904)):
>> #0  0x00005555555faf5d in safe_syscall_base ()
>> #1  0x00005555555cfeaf in safe_futex (uaddr=0x7ffff528a0a4, op=128, val=1, timeout=0x0, uaddr2=0x0, val3=-162668384)
>>     at /home/alex/lsrc/qemu/qemu.git/linux-user/syscall.c:706
>> #2  0x00005555555dd7cc in do_futex (uaddr=4132298916, op=128, val=1, timeout=0, uaddr2=0, val3=-162668384)
>>     at /home/alex/lsrc/qemu/qemu.git/linux-user/syscall.c:6246
>> #3  0x00005555555e8cdb in do_syscall (cpu_env=0x555555a81118, num=240, arg1=-162668380, arg2=128, arg3=1, arg4=0, arg5=0, arg6=-162668384,
>>     arg7=0, arg8=0) at /home/alex/lsrc/qemu/qemu.git/linux-user/syscall.c:10642
>> #4  0x00005555555cd20e in cpu_loop (env=0x555555a81118) at /home/alex/lsrc/qemu/qemu.git/linux-user/main.c:883
>> #5  0x00005555555db0b2 in clone_func (arg=0x7fffffffc9c0) at /home/alex/lsrc/qemu/qemu.git/linux-user/syscall.c:5424
>> #6  0x00007ffff6bed6fa in start_thread (arg=0x7ffff7f38700) at pthread_create.c:333
>> #7  0x00007ffff6923b5d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
>>
>> So everything is stalled awaiting this thread waking up and draining
>> its queue. So for linux-user I think we need some mechanism to kick
>> these syscalls which I assume means throwing a signal at it.
>
> Nice catch! How did you get it?

Running pigz (armhf, debian) to compress stuff.

> We always go through cpu_exec_end()
> before serving a guest syscall and always go through cpu_exec_start()
> before entering the guest code execution loop. If we always schedule
> safe work on the current thread's queue then I think there's a way to
> make it safe and avoid kicking syscalls.

Not let the signals complete until safe work is done?

>
> Thanks,
> Sergey

--
Alex Bennée
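To make the two cases in this thread concrete, here is a rough single-threaded model of the bookkeeping, not QEMU code. Every name in it (struct model, model_queue_safe_work(), model_drain() and so on) is made up for illustration, and it assumes the rule being discussed: a safe work item can only be run by the thread that owns the queue, and only once no thread is executing guest code. A thread that has passed cpu_exec_end() and is blocked in a host syscall is modelled as CPU_IN_SYSCALL; it no longer counts as executing guest code, but it cannot drain its own queue either.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model only: none of these names exist in QEMU. */

enum cpu_state {
    CPU_IN_GUEST,    /* between cpu_exec_start() and cpu_exec_end() */
    CPU_IDLE,        /* past cpu_exec_end(), able to drain its own queue */
    CPU_IN_SYSCALL,  /* past cpu_exec_end(), blocked in a host syscall */
};

typedef void (*work_fn)(void *opaque);

struct model_cpu {
    enum cpu_state state;
    work_fn safe_work;   /* at most one pending item, to keep the model small */
    void *opaque;
};

#define MODEL_NCPUS 2

struct model {
    struct model_cpu cpu[MODEL_NCPUS];
};

/* Start with every CPU in guest code and nothing queued. */
void model_init(struct model *m)
{
    for (int i = 0; i < MODEL_NCPUS; i++) {
        m->cpu[i].state = CPU_IN_GUEST;
        m->cpu[i].safe_work = NULL;
        m->cpu[i].opaque = NULL;
    }
}

/* Queue one safe work item on the queue owned by CPU 'target'. */
void model_queue_safe_work(struct model *m, int target,
                           work_fn fn, void *opaque)
{
    assert(m->cpu[target].safe_work == NULL);
    m->cpu[target].safe_work = fn;
    m->cpu[target].opaque = opaque;
}

/*
 * Run every item that is allowed to run right now and return how many
 * items are still pending. An item runs only if no CPU is in guest code
 * and the owning CPU is idle, i.e. not blocked in a syscall.
 */
int model_drain(struct model *m)
{
    bool any_in_guest = false;
    int pending = 0;

    for (int i = 0; i < MODEL_NCPUS; i++) {
        if (m->cpu[i].state == CPU_IN_GUEST) {
            any_in_guest = true;
        }
    }

    for (int i = 0; i < MODEL_NCPUS; i++) {
        struct model_cpu *c = &m->cpu[i];

        if (c->safe_work == NULL) {
            continue;
        }
        if (any_in_guest || c->state != CPU_IDLE) {
            pending++;          /* the owner cannot run it yet */
            continue;
        }
        c->safe_work(c->opaque);
        c->safe_work = NULL;
    }
    return pending;
}

/* Example work item: counts how many times it has been run. */
void model_count_call(void *opaque)
{
    ++*(int *)opaque;
}
```

With CPU 0 in CPU_IN_SYSCALL and CPU 1 in CPU_IDLE, queueing the item on CPU 0 leaves model_drain() returning 1 for as long as CPU 0 stays blocked, which is the stall in the backtrace. Queueing the same item on CPU 1, the "current thread", lets model_drain() run it and return 0 without touching the blocked thread. The model says nothing about signal delivery, so it does not settle the question about holding signals back until safe work is done.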