From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 18 Apr 2007 13:31:52 -0400 (EDT)
From: Stuart Anderson
Subject: Re: [Qemu-devel] linux-user target
References: <1176228712.22569.27.camel@jma4.dev.netgem.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
To: qemu-devel@nongnu.org

On Tue, 17 Apr 2007, Stuart Anderson wrote:

> I've continued to work on this all week, and I still haven't managed to
> solve it. I've chased down a lot of paths, but none of them have led to
> a solution. Here is a summary of the situation now.
>
> * programs other than bash will run
> * bash --version will run
> * bash --noediting will run
> * occasionally, bash has run if I'm stracing it, but I can't always
>   reproduce it.
> * when it runs, I occasionally see some odd behavior, but not always.
>   The termios patch I just sent cleared up a lot of the oddness.
> * when it runs, it hangs on exit. Killing it logs me all the way out
>   of the system (ssh connection).
> * when it crashes, gdb loses the user-level thread, so I can't do any
>   debugging
> * I don't see any of the TLS related system calls being called. I also
>   don't see any concrete proof one way or another that it is used in
>   the executable (i.e. no R_PPC_*TLS relocations). I've been digging in
>   the kernel & glibc source, and I don't see a lot of special code to
>   support TLS on ppc. It mostly seems to be just taking care to not
>   step on R2. Glibc seems to be the only place where it knows something
>   specific about TLS, which leads me to think that TLS is mostly
>   contained within userspace on PPC.
> * I've tried turning on most of the DEBUG_ defines under linux-user,
>   but none of them has yielded anything useful or noteworthy.

This morning, I went back and tried a 32-bit x86 host (instead of the
x86_64 host), and discovered that everything works just fine. This makes
me think it's a 64-bit issue, so I took a closer look at the build
warnings that exist on x86_64 but not on x86. This pointed to
PPC_OP(goto_tb0) & PPC_OP(goto_tb1) in target-ppc/op.c. It appears that
x86_64 is using the generic portable code, but one of the fields that it
is taking as a pointer (tb_next) is only an int. Changing it to a ulong
eliminated the warning, but it didn't fix things.

After more digging in the qemu.log, I noticed this difference that is
related to those two functions (op_goto_tb0 & op_goto_tb1).

On x86:

00000ebf :
     ebf:       e9 fc ff ff ff          jmp    ec0
     ec4:       c3                      ret

00000ec5 :
     ec5:       e9 fc ff ff ff          jmp    ec6
     eca:       c3                      ret

On x86_64:

000000000000154e :
    154e:       8b 05 00 00 00 00       mov    0(%rip),%eax
    1554:       ff e0                   jmpq   *%rax
    1556:       f3 c3                   repz retq

0000000000001558 :
    1558:       8b 05 00 00 00 00       mov    0(%rip),%eax
    155e:       ff e0                   jmpq   *%rax
    1560:       f3 c3                   repz retq

Note the repz before retq, which is not in the x86 code or in any other
x86_64 op.

In use, the micro ops are:

0x000d: goto_tb1 0x60233800
0x000e: set_T1 0x100a4df8
0x000f: b_T1

for which the generated code becomes:

0x61a5998d:  mov    -25321811(%rip),%eax        # 0x60233840
0x61a59993:  jmpq   *%eax
0x61a59995:  repz lea -1369131941(%rip),%r12d        # 0x100a4df8
0x61a5999d:  mov    %r12d,%eax
0x61a599a0:  and    $0xfffffffffffffffc,%eax
0x61a599a3:  mov    %eax,0xc7f4(%r14)
0x61a599aa:  lea    -25321904(%rip),%r15d        # 0x60233801
0x61a599b1:  retq

The repz is still there from the goto_tb1 op, but it is now applied to
the lea instruction from the set_T1 op.

Is this correct? Would it cause any kind of a problem?


                                Stuart

Stuart R. Anderson                               anderson@netsweng.com
Network & Software Engineering                   http://www.netsweng.com/
1024D/37A79149: 0791 D3B8 9A4C 2CDC A31F  BD03 0A62 E534 37A7 9149
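
The stray prefix described in the message can be reproduced on the raw bytes from the objdump listings. The sketch below is only an illustration and rests on an assumption the message does not confirm: that the code generator copies each op body and drops a single trailing ret byte (c3). The helper names are hypothetical and are not part of QEMU.

```python
# Op bodies copied from the objdump listings in the message
# (relocation bytes left as zero, exactly as objdump printed them).
X86_OP = bytes.fromhex("e9fcffffff" "c3")                 # jmp ; ret
X86_64_OP = bytes.fromhex("8b0500000000" "ffe0" "f3c3")   # mov ; jmpq ; repz retq

RET = b"\xc3"
REP_RET = b"\xf3\xc3"


def strip_ret_naive(code: bytes) -> bytes:
    """Drop only a trailing one-byte ret (ASSUMED generator behaviour)."""
    return code[:-1] if code.endswith(RET) else code


def strip_ret_prefix_aware(code: bytes) -> bytes:
    """Drop a trailing 'repz ret' pair as one unit, else a plain ret."""
    if code.endswith(REP_RET):
        return code[:-2]
    return strip_ret_naive(code)


def has_dangling_rep(code: bytes) -> bool:
    """Heuristic: True if the copied body ends in a bare f3 byte.

    This only inspects the last byte; it does not decode instructions,
    so an f3 that is really part of an operand would also match.
    """
    return code.endswith(b"\xf3")


if __name__ == "__main__":
    for name, op in (("x86", X86_OP), ("x86_64", X86_64_OP)):
        naive = strip_ret_naive(op)
        aware = strip_ret_prefix_aware(op)
        print(name, "naive:", naive.hex(), "dangling f3:", has_dangling_rep(naive))
        print(name, "aware:", aware.hex(), "dangling f3:", has_dangling_rep(aware))
```

Under that assumption, the x86_64 body keeps its f3 byte after the naive strip, and the prefix would then attach to whatever op is emitted next, which matches the "repz lea" seen in the generated code. Whether the prefix is harmless there is the open question of the message, and the sketch does not answer it.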