From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from smtp.transmode.se (smtp.transmode.se [31.15.61.139])
	by lists.ozlabs.org (Postfix) with ESMTP id 417C81A001E
	for ; Wed, 23 Sep 2015 06:07:54 +1000 (AEST)
From: Joakim Tjernlund
To: "scottwood@freescale.com"
CC: "linuxppc-dev@lists.ozlabs.org", "linux-kernel@vger.kernel.org",
	"paulus@samba.org"
Subject: Re: [PATCH v2 22/25] powerpc32: move xxxxx_dcache_range() functions inline
Date: Tue, 22 Sep 2015 20:07:51 +0000
Message-ID: <1442952471.29498.60.camel@transmode.se>
References: <1442945547.29498.50.camel@transmode.se>
	<1442948339.19102.270.camel@freescale.com>
	<1442950473.29498.54.camel@transmode.se>
	<1442950926.19102.280.camel@freescale.com>
	<1442951752.29498.58.camel@transmode.se>
In-Reply-To: <1442951752.29498.58.camel@transmode.se>
Content-Type: text/plain; charset="iso-8859-15"
MIME-Version: 1.0
List-Id: Linux on PowerPC Developers Mail List

> > And generally the one proposing uglification-for-optimization should provide
> > the evidence.  :-)
>
> When it comes to gcc, past history is my evidence until proven otherwise :)
> Maybe I will check again ...

OK then:

static inline void mb(void)
{
	__asm__ __volatile__ ("sync" : : : "memory");
}

static inline void dcbf(void *addr)
{
	__asm__ __volatile__ ("dcbf 0, %0" : : "r"(addr) : "memory");
}

#define L1_CACHE_SHIFT 5
#define L1_CACHE_BYTES (1 << L1_CACHE_SHIFT)

void flush_dcache_range(unsigned long start, unsigned long stop)
{
	void *addr = (void *)(start & ~(L1_CACHE_BYTES - 1));
	unsigned int size = stop - (unsigned long)addr + (L1_CACHE_BYTES - 1);
	unsigned int i;

	for (i = 0; i < size >> L1_CACHE_SHIFT; i++, addr += L1_CACHE_BYTES)
		dcbf(addr);
	if (i)
		mb();	/* sync */
}

gives:

flush_dcache_range:
	stwu %r1,-16(%r1)
	rlwinm %r3,%r3,0,0,26
	addi %r4,%r4,31
	subf %r9,%r3,%r4
	srwi. %r10,%r9,5
	beq %cr0,.L1
	mtctr %r10
	.p2align 4,,15
.L4:
#APP
 # 8 "gccloop.c" 1
	dcbf 0, %r3
 # 0 "" 2
#NO_APP
	addi %r3,%r3,32
	bdnz .L4
#APP
 # 3 "gccloop.c" 1
	sync
 # 0 "" 2
#NO_APP
.L1:
	addi %r1,%r1,16
	blr

good enough :)
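
The trip count that gcc loads into CTR above (srwi. followed by mtctr) can be checked on any host with a small sketch. This is not part of the original post: count_lines() is a hypothetical helper that repeats the same address arithmetic as flush_dcache_range(), assuming 32-byte cache lines and 32-bit addresses, and counts iterations instead of issuing dcbf.

```c
#include <stdint.h>

#define L1_CACHE_SHIFT 5
#define L1_CACHE_BYTES (1u << L1_CACHE_SHIFT)

/*
 * Hypothetical helper (not from the original mail): mirrors the
 * arithmetic of flush_dcache_range() on 32-bit values and returns
 * how many cache lines the loop would touch. No cache instruction
 * is issued, so this runs on any host.
 */
unsigned int count_lines(uint32_t start, uint32_t stop)
{
	/* Round start down to a cache-line boundary (the rlwinm above). */
	uint32_t addr = start & ~(L1_CACHE_BYTES - 1);
	/* Bytes from the aligned start to stop, rounded up to a full line. */
	uint32_t size = stop - addr + (L1_CACHE_BYTES - 1);
	unsigned int i;

	/* Same loop shape as the original; the dcbf is replaced by nothing. */
	for (i = 0; i < size >> L1_CACHE_SHIFT; i++)
		addr += L1_CACHE_BYTES;

	return i;
}
```

With these definitions, count_lines(0, 32) returns 1 and count_lines(31, 33) returns 2, since the second range straddles a line boundary. count_lines(0, 0) returns 0, which is the case where the beq in the listing skips both the loop and the sync.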