From: dave.martin@linaro.org (Dave Martin)
To: linux-arm-kernel@lists.infradead.org
Subject: Question on ARM inline assembly - stackframes, sp/fp usage, __naked
Date: Thu, 19 May 2011 10:47:53 +0100 [thread overview]
Message-ID: <20110519094753.GB2431@arm.com> (raw)
In-Reply-To: <alpine.DEB.2.00.1105190858320.2374@localhost6.localdomain6>
On Thu, May 19, 2011 at 09:46:30AM +0100, Frank Hofmann wrote:
>
>
> On Wed, 18 May 2011, Dave Martin wrote:
>
> >On Wed, May 18, 2011 at 04:25:15PM +0100, Frank Hofmann wrote:
> >>Hi,
> >>
> >>
> >>with the work on the hibernation support code, I've managed to make
> >>it all C-with-inline-asm. This happens to work fine on OMAP3, but
> >>there's an issue when I build it for Samsung 64xx. It's about this
> >>code:
> >>
> >>
> >>int __naked swsusp_arch_resume(void)
> >>{
> >> register struct pbe *pbe;
> >> register void *sp asm("sp") = swsusp_resume_stack + PAGE_SIZE - 8;
> >
> >(Why - 8?)
>
> Sorry, my mistake, thought PUSH is post-decrement (empty
> descending). From that assumption and the 8-byte-alignment-at-call,
> the offset were necessary. You're right it's not needed.
> Thanks for spotting.
OK -- I thoughr you might be using the top two words for something else.
push is just the same as stmfd sp!,
>
> >
> >> register void *saved_ctx asm ("r0") = ctx;
> >>
> >> __asm__ __volatile__("" : "+sp"(sp)); /* don't optimize away ! */
> >
> >Is this an attempt to force gcc to change sp before the subsequent loop?
>
> That's exactly what it does.
>
> >I think you don't mean "+sp". Something like "+r" might be appropriate.
> >
> >This feels subtle and fragile to me.
>
> Looked this up; again you're right the direct use of "rXX" (or the
> higher reg names) for gcc ARM inline assembly input/output
> constraints is undocumented, but the "Extended ASM" section in the
> manual,
>
> http://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html
>
> states that the way to force a variable into a specific register is:
>
> register <type> variable asm("register") = ...;
>
> and then to make sure it happens, to create an
> empty-asm-with-constraint that forces a register write,
>
> asm volatile("" : "+r"(variable));
>
> For the paranoid, one could do:
>
> asm volatile("mov sp, %0\n\t" : "+r"(variable) : : "sp");
>
> which is, unlike the use of "sp" as input/output register, actually
> documented explicitly in the "Extended ASM" section of gcc's manual,
> quote:
>
> If you refer to a particular hardware register from the
> assembler code, you will probably have to list the register
> after the third colon to tell the compiler the register's
> value is modified. In some assemblers, the register names
> begin with `%'; to produce one `%' in the assembler code,
> you must write `%%' in the input.
>
> (That's the only place where the doc acknowledges gcc's inline
> assembler's knowledge of machine register names as constraints - in
> the clobber list)
I think the clobber list is a different thing from the constraint
lists: in the clobber list, you can only have register names and a few
special things like "cc" and "memory".
In the constraint lists, you can only have constraints, not register
names. On some arches, there are constraints which map to a specific
register (such as the "a", "b", "c", "d" constraints on x86), but
I don't think there's anything like this for ARM.
As you observe, the register ... asm("rX") declarations are needed
instead to map arguments to specific registers.
>
> [ ... ]
> >Probably this is a side-effect of building for ARM versus Thumb-2.
> >ARM1176 only supports ARM, and fp is used as the framepointer when
> >building a kernel in ARM.
>
> Both targets in my compile are ARM, but yes you're right
> framepointer or no is a difference in the two defconfigs.
>
> >
> >Building that file with -fomit-frame-pointer avoids the error,
> >but that's best avoided unless there's a really good reason.
>
> The very specific purpose of this function ... it'd never be
> possible to backtrace over it in any case.
Yep, that was my assumption. If a fault happens in this code, the
battle is probably already lost.
>
> >
> >You would get the same problem when building for Thumb-2 if you pass
> >-fno-omit-frame-pointer to the compiler and attempt to clobber r7
> >(by default, Thumb-2 code has no framepointer).
> >
> [ ... ]
> >>I am wondering if there's a way to convince gcc to do the right
> >>thing here, __naked notwithstanding.
> >
> >I think basically luck is the answer. If you ask the compiler to saw
> >its own head off it may occasionally make a neat job of it, but don't
> >bet on that happening every time...
>
> The head-sawing-off piece works fine (substituting the
> stackpointer), but to convince it that it shouldn't attempt to
> facepalm itself afterwards looks like a bit harder ;-)
Well, no analogy is perfect ;)
>
>
> >
> >>
> >>Is the use of a naked "bounce function" for this specific purpose
> >>(switch the stackpointer) acceptable, and guaranteed to create the
> >>correct code ?
> >>
> >>
> >>It'd be nice not having to dump this piece into assembly...
> >
> >Using out-of-line assembler avoids the ABI issues, and is often
> >more readable IMHO.
> >
> >On Wed, May 18, 2011 at 06:50:07PM +0200, Mikael Pettersson wrote:
> >[...]
> >>Just write the stack switching fragment in proper assembly, and
> >>keep both sides of it in normal C. End of problem.
> >
> >Indeed, though there is only a little section of C, in the middle
> >of some assembler which is not implementable in C.
>
> The stack switching segment isn't the problem ... it's the C bit, as
> long as it lives in the same func.
>
> >
> >
> >Maybe something like this:
> >
> >
> >ENTRY(swsusp_arch_resume)
> > ldr sp, resume_sp
> > bl __swsusp_arch_resume_copy_pages
> > /* the rest of your code code goes here ... */
> >
> >.align 2
> >resume_sp: .long swsusp_resume_stack + PAGE_SIZE - 8
> >ENDPROC(swsusp_arch_resume)
>
> Yow, morale of the exercise, I guess: It's possible to shoehorn GCC
> into doing what I want, but the shoe isn't a good fit ;-)
>
> I'm still not convinced why:
>
> void __naked swsusp_arch_resume(void)
> {
> register void *sp asm("sp") = ...;
> asm volatile("" : "+r"(sp));
> __swsusp_arch_resume_copy_pages();
> __swsusp_arch_resume_cpu();
> }
>
> with two explicit "noinline" funcs wouldn't be ok.
As I said, it may work, today. It just feels more complex and
failure-prone, because it reiles on details of the compiler behaviour.
If doing this provided a real benefit, like better maintainability or
performance, then it would be worth trying. But I feel we don't get
any real advantages in this case.
That's just my opinion, but I know from experience how hard it is to get
compiler guys on your side when this kind of hack goes wrong...
>
> >
> >
> >Then in C:
> >
> >asmlinkage void __swsusp_arch_resume_copy_pages(void)
> >{
> > struct pbe *pbe;
> >
> > for (pbe = restore_pblist; pbe; pbe = pbe->next)
> > copy_page(pbe->orig_address, pbe->address);
> >}
> >
> >
> >There's now nothing for the compiler to get wrong...
>
> Maybe I'm trying too hard ;-) all other architectures supporting
> suspend-to-disk haven't dared to put this thing into C ... even
> though they all do this very operation as well. But then, none of
> the other swsusp_arch_resume elsewhere has done the stackpointer
> switch either.
Well, the more we can put in C the better, because C code is easier to
maintain and reuse -- hence it's good to be calling copy_pages() instead
of re-coding this in the assembly code. But there's a difference between
coding in C and shoehorning assembler into a .c file.
For totally arch-specific parts like saving/restoring the processor
state, at least some assembler is unavoidable...
Cheers
---Dave
next prev parent reply other threads:[~2011-05-19 9:47 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-05-18 15:25 Question on ARM inline assembly - stackframes, sp/fp usage, __naked Frank Hofmann
2011-05-18 16:06 ` Frank Hofmann
2011-05-18 16:50 ` Mikael Pettersson
2011-05-18 17:45 ` Dave Martin
2011-05-19 8:46 ` Frank Hofmann
2011-05-19 9:47 ` Dave Martin [this message]
2011-05-19 15:25 ` Frank Hofmann
2011-05-19 11:08 ` Måns Rullgård
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110519094753.GB2431@arm.com \
--to=dave.martin@linaro.org \
--cc=linux-arm-kernel@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).