From mboxrd@z Thu Jan 1 00:00:00 1970 Received: with ECARTIS (v1.0.0; list linux-mips); Mon, 22 Sep 2014 11:31:49 +0200 (CEST) Received: from resqmta-po-06v.sys.comcast.net ([96.114.154.165]:33524 "EHLO resqmta-po-06v.sys.comcast.net" rhost-flags-OK-OK-OK-OK) by eddie.linux-mips.org with ESMTP id S27009482AbaIVJbrdBq0k (ORCPT ); Mon, 22 Sep 2014 11:31:47 +0200 Received: from resomta-po-13v.sys.comcast.net ([96.114.154.237]) by resqmta-po-06v.sys.comcast.net with comcast id u9Xe1o00557bBgG019XeNm; Mon, 22 Sep 2014 09:31:38 +0000 Received: from [192.168.1.13] ([69.251.152.165]) by resomta-po-13v.sys.comcast.net with comcast id u9Xc1o00C3aNLgd019Xd5B; Mon, 22 Sep 2014 09:31:38 +0000 Message-ID: <541FEC77.3010905@gentoo.org> Date: Mon, 22 Sep 2014 05:31:35 -0400 From: Joshua Kinard User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.1.1 MIME-Version: 1.0 To: linux-mips@linux-mips.org Subject: Re: gcc-4.8+ and R10000+ References: <540C165F.7030307@gentoo.org> <540D7A7B.9090905@gentoo.org> In-Reply-To: <540D7A7B.9090905@gentoo.org> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net; s=q20140121; t=1411378298; bh=0BGjuHoQ+k1PhU5OciqK0saJ8KanMQBHENNmiSg55WM=; h=Received:Received:Message-ID:Date:From:MIME-Version:To:Subject: Content-Type; b=dh0lkaJmN9f6t8Lib+UnbSRfkDvTNMa3Iki76pXRuCb+dVatWTLn2iLKuAHBwQG/1 KHEk/KAfmFAa5IYncKsHQgBJF8Y4elwsCidhrGyPX8qxq0yuq+Ppt6i90W4y66QQMk XaCx+hZIwjI0tFXeP2vwFmUjIpelDhQxSqBltLogGLjqu5NajOPgUDu+MRkxXgAH0B hYYoVLdnwXhA4pFM+coeOWn1J4rHoJVhHLY0UOttP/IeFT2nVCpToUcfz3L8V2uVYu T1lVICc8x9SpkoYSOIGA/WHChW7pqfOu+lX1NN14zIGOuFbSAJ8NbYT5ARAwoXvN4c omPq7XGmn1BVQ== Return-Path: X-Envelope-To: <"|/home/ecartis/ecartis -s linux-mips"> (uid 0) X-Orcpt: rfc822;linux-mips@linux-mips.org Original-Recipient: rfc822;linux-mips@linux-mips.org X-archive-position: 42722 X-ecartis-version: Ecartis v1.0.0 Sender: 
linux-mips-bounce@linux-mips.org
Errors-to: linux-mips-bounce@linux-mips.org
X-original-sender: kumba@gentoo.org
Precedence: bulk
List-help:
List-unsubscribe:
List-software: Ecartis version 1.0.0
List-Id: linux-mips
X-List-ID: linux-mips
List-subscribe:
List-owner:
List-post:
List-archive:
X-list: linux-mips

On 09/08/2014 05:44, Joshua Kinard wrote:
> On 09/08/2014 04:11, Miod Vallat wrote:
>>> Disassembling a statically-built copy of the "sln" binary generated by
>>> glibc's compile phase, there are slight differences in how gcc-4.7 and
>>> gcc-4.8 are compiling the __lll_lock_wait_private function. The key
>>> differences in the output asm are
>>> this:
>>
>> [...]
>>
>>> gcc-4.8:
>>> x+4
>>> ...
>>> x+24 bne v1,v0,
>>> ...
>>> x+32 0x7c03e83b /* rdhwr */
>>> x+36 li a2,2
>>> x+40 lw a1,-29832(v1)
>>> x+44 move a3,zero
>>> x+48 li v0,4238
>>> x+52 syscall
>>> * x+56 ll v0,0(s0)
>>> * x+60 li at,2
>>> * x+64 sc at,0
>>
>> Note how the sc address is no longer 0(s0). Since the address does
>> not match the address used in the ll instruction, sc will always
>> fail on the R10k.
>
> That would be a typo on my part. I typed that out by hand and just missed it.
> It should read:
>
> gcc-4.8:
> x+4
> ...
> x+24 bne v1,v0,
> ...
> x+32 0x7c03e83b /* rdhwr */
> x+36 li a2,2
> x+40 lw a1,-29832(v1)
> x+44 move a3,zero
> x+48 li v0,4238
> x+52 syscall
> * x+56 ll v0,0(s0)
> * x+60 li at,2
> * x+64 sc at,0(s0)
> x+68 beqzl at,
> x+72 nop
> x+76 sync
> x+80 bnez v0,

I did some more tracing. It seems the issue with glibc itself stems from the
__atomic_* builtins, which were added generally in gcc-4.7 and for MIPS
specifically in gcc-4.8:

From ports/sysdeps/mips/bits/atomic.h (for 2.19) or
sysdeps/mips/bits/atomic.h (for 2.20):

/* The __atomic_* builtins are available in GCC 4.7 and later, but MIPS
   support for their efficient implementation was added only in GCC 4.8.
   We still want to use them even with GCC 4.7 for MIPS16 code where we
   have no assembly alternative available and want to avoid the __sync_*
   builtins if at all possible. */

#if __GNUC_PREREQ (4, 8) || (defined __mips16 && __GNUC_PREREQ (4, 7))
[snip]

This is why the assembly is different between the two gcc versions. This same
code is in the kernel's atomic.h copy under arch/mips/include/asm/ as well.

I tested by removing the top part of the #if macro and basically forcing the
inline versions only, then rebuilt glibc-2.20 with gcc-4.9.2 (20140921
prerelease), and lo and behold, sln executes and returns its usage
information. When using the gcc internal builtins, a futex gets used, which is
why I wasn't seeing futexes in 4.7-built copies of sln, only in copies built
with 4.8 or greater.

This means that the gcc internal __atomic_* builtins may be somewhat to blame
for this problem on R1x000 systems.

I traced the kernel side of the problem out and figured out that when the
futex is taken by sln, the process gets frozen by the scheduler via a call to
freezable_schedule() in the function futex_wait_queue_me() in kernel/futex.c.
I added two printk statements, one before freezable_schedule() and one after.
The first one executes (verified by dumping /proc/kmsg directly, because dmesg
itself generates futexes), but the one after does not. The printk after
freezable_schedule() only executes when I Ctrl+C the frozen process and it
exits out of the futex code.

I visually checked through include/linux/freezer.h and noticed that
freezable_schedule() eventually calls freezing(), which executes an
atomic_read() on system_freezing_cnt. In the mips code, that just comes out as
a pointer dereference of a volatile variable. I'm not certain, though, whether
in gcc's case the use of volatile means it tries to use its builtin __atomic_
functions again, and tries to take another futex /while it's trying to take a
futex/. Chicken and egg?
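To make the atomic_read() question above concrete, here is a minimal userspace
sketch in C. It is not kernel or glibc code: my_atomic_t, my_atomic_read,
my_trylock and my_selfcheck are hypothetical names, and the macro is only
modeled on the "pointer dereference of a volatile variable" form described
above. The one real GCC interface used is __atomic_compare_exchange_n, which
needs GCC 4.7 or later. The sketch assumes that a volatile read is an ordinary
load that does not go through the __atomic_* builtins, and that for a plain int
the builtin is expanded inline (ll/sc on MIPS, as far as I know) rather than
making a system call itself. Under that assumption the futex syscall (4238 on
o32) in the disassembly would come from glibc's lock slow path, not from the
builtin.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's atomic_t; illustration only. */
typedef struct { int counter; } my_atomic_t;

/* Modeled on the volatile-dereference form described above: the cast
   forces a real load each time, but it is still just a load. */
#define my_atomic_read(v) (*(volatile int *)&(v)->counter)

/* Lock fast path: try to move the lock word from 0 to 1 with a GCC
   __atomic builtin. Returns 1 on success, 0 if the word was not 0.
   A real lock would fall back to a futex wait on failure; that slow
   path is deliberately left out of this sketch. */
static int my_trylock(int *lock)
{
    int expected = 0;
    return __atomic_compare_exchange_n(lock, &expected, 1, 0,
                                       __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
}

/* Small self-check exercising both helpers. */
static void my_selfcheck(void)
{
    my_atomic_t cnt = { 3 };
    int lock = 0;

    assert(my_atomic_read(&cnt) == 3); /* plain volatile load */
    assert(my_trylock(&lock) == 1);    /* 0 -> 1 succeeds */
    assert(my_trylock(&lock) == 0);    /* already held, fails */
    assert(lock == 1);                 /* failed attempt left it alone */
}
```

Calling my_selfcheck() from a main() exercises both paths. Comparing the
output of gcc -S for my_trylock under gcc-4.7 and gcc-4.8 on a MIPS target
would be one way to look at the generated ll/sc sequence in isolation.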
So, it could still very well be a gcc issue, or maybe it's something really
subtle in the kernel code. I am not sure which. I at least know of a specific
gcc commit that enables/disables the problem, and that's pointing the finger
at gcc here.

Ideas?

--
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
4096R/D25D95E3 2011-03-28

"The past tempts us, the present confuses us, the future frightens us. And our
lives slip away, moment by moment, lost in that vast, terrible in-between."

--Emperor Turhan, Centauri Republic