From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <50520C6D.2000700@xenomai.org>
Date: Thu, 13 Sep 2012 18:40:13 +0200
From: Gilles Chanteperdrix
MIME-Version: 1.0
References: <1347469235-17712-1-git-send-email-wolfgang.mauerer@siemens.com>
 <1347469235-17712-4-git-send-email-wolfgang.mauerer@siemens.com>
 <505180A8.2010100@xenomai.org> <50519D74.1060801@siemens.com>
 <5051AA22.2030107@xenomai.org> <5051D85A.4090105@siemens.com>
 <5051DF0A.7030901@xenomai.org> <50520907.3080101@siemens.com>
In-Reply-To: <50520907.3080101@siemens.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Subject: Re: [Xenomai] [PATCH 3/3] Fix IRQs-off-tracer for x86_64
List-Id: Discussions about the Xenomai project
To: Wolfgang Mauerer
Cc: "xenomai@xenomai.org"

On 09/13/2012 06:25 PM, Wolfgang Mauerer wrote:
> On 13/09/12 15:26, Gilles Chanteperdrix wrote:
>> On 09/13/2012 02:58 PM, Wolfgang Mauerer wrote:
>>> On 13/09/12 11:40, Gilles Chanteperdrix wrote:
>>>> On 09/13/2012 10:46 AM, Wolfgang Mauerer wrote:
>>>>> On 13/09/12 08:43, Gilles Chanteperdrix wrote:
>>>
>>> ... (some code with varying amounts of pops and pushed)...
>>>
>>>>> that fixes the bug, too -- thanks. However, I deliberately
>>>>> spent some extra assembler code (5 insns compared to your
>>>>> approach) to make the code easier to understand (the overall
>>>>> size of the resulting kernel binary is identical for both
>>>>> approaches on my machine, btw.). Since the overhead of 5
>>>>> instructions is negligible compared to the two calls of
>>>>> __ipipe_trace that are necessary in any case, I'd argue that
>>>>> the more straightforward code is beneficial in the long run,
>>>>> but that's for the maintainers to decide.
>>>>
>>>> To the contrary, the profusion of pushq and popq looked
>>>> confusing to me, I wondered why so many were needed, so, I
>>>> guess it is all a matter of taste.
>>>
>>> most certainly a matter of taste, agreed. I've rearranged your
>>> suggestion slightly, added a comment or two, and turned it into
>>> a patch. Repo at https://github.com/siemens/ipipe.git
>>> for-upstream is also updated.
>>
>> Ok. Small nit:
>>
>> +    movq ORIG_RAX+8-RBP(%rsp), %rdi   # IRQ number...
>>      notq %rdi                         # ...is inverted, fix up
>> +    leaq RIP-8+8-RBP(%rsp), %rbp      # Show interrupted address in trace
>>
>> I would avoid to use a register as a source in an instruction
>> right after an instruction where it was used as a destination,
>> because I believe in processors which do not have out of order
>> execution, such as Intel Atoms, the processor would have to wait
>> for an instruction to be finished to start the next. Though I do
>> not know enough of the x86 architecture to know if that is true.
>> That is the reason why I put leaq before notq.
>
> thanks for the suggestion. Indeed even on a 2.7GHz Xeon where all CPU
> trickery is available, the assembler instruction reordering gives a
> speed-up of about 0.35ns, so one cycle. However, for any
> application with such extreme performance needs, tracing would be
> turned off anyway, wouldn't it? For debugging purposes, it does make no
> difference whatsoever, so I'd argue that code clarity is the more
> valuable optimisation goal in this case.

The tracer speed is what makes the tracer usable or not, at all, on low
end platforms (where it is probably the most needed). I agree that a few
cycles at this point are not so important. On the other hand, I do not
see how inverting two instructions makes the code more or less readable.
It is assembly, it is not readable, no matter what. And it is so easy to
put the instructions in the right order that I see no reason to avoid
doing it.

Anyway, I will merge your patches, thanks for taking care of this issue.

-- 
Gilles.