From: Rudolf Marek
Message-ID: <56166CC0.1000600@sysgo.com>
Date: Thu, 8 Oct 2015 15:16:48 +0200
Subject: [Qemu-devel] x86 amd64 singlestepping bug through syscall instruction
To: qemu-devel@nongnu.org
Cc: Andreas Gustafsson

Hi all,

I was told on IRC to use this mailing list to report the following bug.

There seems to be something wrong with how QEMU handles single-stepping
across the AMD64 syscall instruction.

The AMD "syscall" instruction clears the RFLAGS bits that are set in the
FMASK MSR. Normally the TF bit is set there, so the first instruction
executed in the kernel after syscall does not raise a single-step
exception in the kernel.

The observed result is an unhandled single-step fault in the kernel, a
host reboot, or a QEMU crash.

One way to reproduce it is to single-step through any function that
executes a "syscall" instruction. Once the syscall is entered, QEMU
triggers a single-step exception in the kernel even though TF is set in
the FMASK MSR. Real hardware behaves correctly and does not trigger this
exception.
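For reference, the flag masking described above can be modelled in a few
lines. This is only a simplified sketch of the architectural behaviour
and not QEMU's code: the function names syscall_rflags and
would_trap_in_kernel are made up for illustration, and everything else
the instruction does (loading RIP, CS, SS, and so on) is left out.

```python
# Simplified model of what SYSCALL does to RFLAGS in 64-bit mode.
# Hypothetical helper names; this is not taken from QEMU.

TF = 1 << 8  # RFLAGS.TF, the trap (single-step) flag
IF = 1 << 9  # RFLAGS.IF, the interrupt enable flag

MASK64 = 0xFFFFFFFFFFFFFFFF


def syscall_rflags(rflags, fmask):
    """Return (r11, new_rflags) for a SYSCALL executed with these values.

    The old RFLAGS value is saved in R11, then every RFLAGS bit that is
    set in the FMASK MSR (MSR C000_0084h) is cleared.
    """
    r11 = rflags & MASK64
    new_rflags = rflags & ~fmask & MASK64
    return r11, new_rflags


def would_trap_in_kernel(rflags, fmask):
    """True if TF is still set once the kernel entry point is reached."""
    _, new_rflags = syscall_rflags(rflags, fmask)
    return bool(new_rflags & TF)
```

With TF set in the mask, would_trap_in_kernel(rflags, TF | IF) is False
for any rflags value, which is the behaviour seen on real hardware. The
bug report is that QEMU nevertheless delivers the single-step exception
in the kernel.
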
Interestingly, I was not able to trigger it by just enabling TF and
executing the syscall instruction; perhaps it matters for this bug that
TF has been set for the previous few instructions.

I stumbled on this problem while working on our custom OS. However, after
some googling I found that the NetBSD developers (CCed) are seeing a very
similar problem, and I asked them to prepare an ISO image with which the
problem ends in a QEMU SIGSEGV or a host reboot. You can check the
original report here:

http://gnats.netbsd.org/49603

The steps to reproduce the problem with NetBSD are pasted at the end of
this email.

Unfortunately, all I was able to do was verify that QEMU has code to
clear RFLAGS based on the FMASK MSR. Any further help is much appreciated.

Thanks,
Rudolf

> Hi Rudolf,
>
> Here's a more Linux-friendly recipe for reproducing the bug. A couple of
> gigabytes of free disk space are needed for the uncompressed OS image.
>
>    wget http://www.gson.org/bugs/qemu/NetBSD-amd64-2015.08.01.16.18.47-com0.img.gz
>    gunzip NetBSD-amd64-2015.08.01.16.18.47-com0.img.gz
>    qemu-system-x86_64 -nographic -snapshot NetBSD-amd64-2015.08.01.16.18.47-com0.img
>    (wait for the qemu guest to boot to a login prompt)
>    (log in as root; there is no password)
>    gdb /bin/sync
>    break sync
>    run
>    stepi
>    stepi
>    stepi
>    (The qemu guest will either instantly reboot or hang, or qemu will segfault)
>    (On real hardware, you just get another gdb prompt, and gdb is still responding)

-- 
S přátelským pozdravem / Best regards / Mit freundlichen Grüßen

Ing. Rudolf Marek
SYSGO s.r.o.
Zelený pruh 99
CZ-14800 Praha 4
Phone: +420 222138 111, +49 6136 9948 111
Fax: +420 296374890, +49 6136 9948 1 111
rudolf.marek@sysgo.com
http://www.sysgo.com | http://www.elinos.com | http://www.pikeos.com