From mboxrd@z Thu Jan  1 00:00:00 1970
From: H. Peter Anvin
Date: Sun, 12 Aug 2007 22:51:53 -0700
Subject: [Cluster-devel] Re: [PATCH] gfs2: better code for translating characters
In-Reply-To: <91b13c310708122206v5e4023f2w7464611a96ae67d9@mail.gmail.com>
References: <11869741183677-git-send-email-crquan@gmail.com> <91b13c310708122008w27b86359n5b135df3e229e616@mail.gmail.com> <46BFDDBB.2020104@zytor.com> <91b13c310708122206v5e4023f2w7464611a96ae67d9@mail.gmail.com>
Message-ID: <46BFF179.8060308@zytor.com>
List-Id:
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

rae l wrote:
>>>
>> Only if the compiler is stupid.
> What? Did you know I really say? Could you tell a little more clear?
>
> if the string in sdp->sd_table_name has many '/' chars, the latter
> algorithm will be better.
> but if there's no '/' char, one assignment will be wasted.
>

You seem to have confused modern compiled C with an old BASIC
interpreter.

Consider the code in point:

-	while ((table = strchr(sdp->sd_table_name, '/')))
+	table = sdp->sd_table_name;
+	while ((table = strchr(table, '/')))
 		*table = '_';

sdp->sd_table_name refers to a memory location, and will have to be
loaded from memory into a register before it can be transmitted to the
strchr() function.  In the latter case, we call this register "table";
since the value is immediately killed after the function call, there is
no reason for the compiler to carry it across the function.

Consider x86-64 as an example:

	# Assume sdp is held in %r15 at this point, and assume
	# the offset of sd_table_name is 0x30.

	# First case
.L1:
	movq 0x30(%r15), %rdi	# First argument register
	movb $'/', %sil		# Second argument register
	call strchr
	testq %rax, %rax	# Result register
	jz .L2
	movb $'_', (%rax)
	jmp .L1
.L2:

	# Second case
	movq 0x30(%r15), %rdi
.L1:
	movb $'/', %sil
	call strchr
	testq %rax, %rax
	jz .L2
	movq %rax, %rdi
	movb $'_', (%rax)
	jmp .L1
.L2:

As you can see, in the zero case (no '/' in the string), the instruction
sequence is exactly the same, whereas in the nonzero case, we have
replaced a memory load with a register-register copy.  On most
architectures (x86-64, Alpha and MIPS are the oddballs here) we wouldn't
even need the copy, since the return value comes back in the same
register that holds the first argument.

	-hpa
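
For anyone who wants to try the two loop forms outside the kernel, here
is a minimal userspace sketch.  The function names translate_rescan()
and translate_resume() are made up for illustration and do not exist in
gfs2, and a plain char buffer stands in for sdp->sd_table_name.  Both
forms leave the same result in the buffer; the only difference is where
each strchr() call starts scanning.

```c
#include <string.h>

/*
 * Hypothetical helper mirroring the original loop: every strchr()
 * call starts again from the beginning of the buffer.
 */
static void translate_rescan(char *name)
{
	char *table;

	while ((table = strchr(name, '/')))
		*table = '_';
}

/*
 * Hypothetical helper mirroring the patched loop: every strchr()
 * call resumes from the position of the previous match.
 */
static void translate_resume(char *name)
{
	char *table = name;

	while ((table = strchr(table, '/')))
		*table = '_';
}
```

Calling either helper on a writable buffer such as
char buf[] = "cluster/fs/name"; rewrites it in place, and a buffer
without any '/' is left untouched by both.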