From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:41321) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YTZT4-0000fO-Lf for qemu-devel@nongnu.org; Thu, 05 Mar 2015 12:21:51 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YTZT0-0000cJ-EX for qemu-devel@nongnu.org; Thu, 05 Mar 2015 12:21:50 -0500 Received: from mx1.redhat.com ([209.132.183.28]:39962) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YTZT0-0000c4-8E for qemu-devel@nongnu.org; Thu, 05 Mar 2015 12:21:46 -0500 Message-ID: <54F890A7.4050608@redhat.com> Date: Thu, 05 Mar 2015 12:21:43 -0500 From: John Snow MIME-Version: 1.0 References: <201503040948.05637.wpaul@windriver.com> In-Reply-To: <201503040948.05637.wpaul@windriver.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH] Fix bug in implementation of SYSRET instruction for x86-64 List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Bill Paul , qemu-devel@nongnu.org Cc: Paolo Bonzini , rth@twiddle.net CC'ing X86 maintainers. On 03/04/2015 12:48 PM, Bill Paul wrote: > Hi guys. I seem to have found a bug in the helper_systet() function in > > target-i386/seg_helper.c. I downloaded the Intel architecture manual > from here: > > http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html > > And it describes the behavior of SYSRET with regards to the stack > selector (SS) as follows: > > SS.Selector <-- (IA32_STAR[63:48]+8) OR 3; (* RPL forced to 3 *) > > In other words, the value of the SS register is supposed to be loaded > from bits 63:48 of the IA32_STAR model-specific register (MSR), > incremented by 8, _AND_ ORed with 3. ORing in the 3 sets the privilege > level to 3 (user). This is done since SYSRET returns to user mode after > a system call. > > The code in helper_sysret() performs the "increment by 8" step, but > _NOT_ the "OR with 3" step. > > Here's another description of the SYSRET instruction which says > basically the same thing, though in slightly different format: > > http://tptp.cc/mirrors/siyobik.info/instruction/SYSRET.html > > [...] > > SS(SEL) = IA32_STAR[63:48] + 8; > > SS(PL) = 0x3; > > [...] > > The effect of this bug is that when x86-64 code uses the SYSCALL > instruction to enter kernel mode, the SYSRET instruction will return the > CPU to user mode with the SS register incorrectly set (the privilege > level bits will be 0). In my case, the original SS value for user mode > was 0x2B, but after the return from SYSRET, it was changed to 0x28. It > took me quite some time to realize this was due to a bug in QEMU rather > than my own code. > > This caused a problem later when the user-mode code was preempted by an > interrupt. At the end of the interrupt handling, an IRET instruction was > used to exit the interrupt context, and the IRET instruction would > generate a general protection fault because it would attempt to validate > the stack selector value and notice that it was set for PL 0 > (supervisor) instead of PL 3 (user). > > From my reading, the code is quite clearly in error with respect to the > Intel documentation, and fixing the bug in my local sources results in > my test code (which runs correctly on real hardware) working correctly > in QEMU. It seems this bug has been there for a long time -- I happened > to have the sources for QEMU 0.10.5 and the bug is there too (in > target-i386/op_helper.c). I'm puzzled as to how it's gone unnoticed for > so long, which has me thinking that maybe I'm missing something. I'm > pretty sure this is wrong though. > > I'm including a patch to fix this below. It seems to fix the problem > quite nicely on my QEMU 2.2.0 installation. I'm also attaching a > separate copy in case my mail client mangles the formatting on the > inline copy. > > -Bill > > -- > > ============================================================================= > > -Bill Paul (510) 749-2329 | Senior Member of Technical Staff, > > wpaul@windriver.com | Master of Unix-Fu - Wind River Systems > > ============================================================================= > > "I put a dollar in a change machine. Nothing changed." - George Carlin > > ============================================================================= > > Signed-off-by: Bill Paul > > --- > > target-i386/seg_helper.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/target-i386/seg_helper.c b/target-i386/seg_helper.c > > index fa374d0..2bc757a 100644 > > --- a/target-i386/seg_helper.c > > +++ b/target-i386/seg_helper.c > > @@ -1043,7 +1043,7 @@ void helper_sysret(CPUX86State *env, int dflag) > > DESC_CS_MASK | DESC_R_MASK | DESC_A_MASK); > > env->eip = (uint32_t)env->regs[R_ECX]; > > } > > - cpu_x86_load_seg_cache(env, R_SS, selector + 8, > > + cpu_x86_load_seg_cache(env, R_SS, (selector + 8) | 3, > > 0, 0xffffffff, > > DESC_G_MASK | DESC_B_MASK | DESC_P_MASK | > > DESC_S_MASK | (3 << DESC_DPL_SHIFT) | > > @@ -1056,7 +1056,7 @@ void helper_sysret(CPUX86State *env, int dflag) > > DESC_S_MASK | (3 << DESC_DPL_SHIFT) | > > DESC_CS_MASK | DESC_R_MASK | DESC_A_MASK); > > env->eip = (uint32_t)env->regs[R_ECX]; > > - cpu_x86_load_seg_cache(env, R_SS, selector + 8, > > + cpu_x86_load_seg_cache(env, R_SS, (selector + 8) | 3, > > 0, 0xffffffff, > > DESC_G_MASK | DESC_B_MASK | DESC_P_MASK | > > DESC_S_MASK | (3 << DESC_DPL_SHIFT) | > > -- > > 1.8.0 >