* [PATCH] Document LWS ABI.
@ 2008-07-15 12:33 Carlos O'Donell
2008-07-15 17:32 ` Helge Deller
0 siblings, 1 reply; 7+ messages in thread
From: Carlos O'Donell @ 2008-07-15 12:33 UTC (permalink / raw)
To: Kyle McMartin, Helge Deller, John David Anglin, linux-parisc
[-- Attachment #1: Type: text/plain, Size: 1301 bytes --]
Helge,
The LWS interface *already* has a 64-bit runtime entry point, see
arch/parisc/kernel/syscall.S:
~~~
.align PAGE_SIZE
/* Light-weight-syscall table */
/* Start of lws table. */
ENTRY(lws_table)
LWS_ENTRY(compare_and_swap32) /* 0 - ELF32 Atomic compare and swap */
LWS_ENTRY(compare_and_swap64) /* 1 - ELF64 Atomic compare and swap */
END(lws_table)
/* End of lws table */
~~~
The entry path unconditionally clips the LWS number to 32 bits.
On a 32-bit kernel, calling LWS entry #0 carries out the CAS on the
untouched 32-bit registers.
On a 32-bit kernel, calling LWS entry #1 returns ENOSYS.
On a 64-bit kernel, calling LWS entry #0 clips all registers to 32 bits
and carries out the CAS.
On a 64-bit kernel, calling LWS entry #1 carries out the CAS on the
untouched 64-bit registers.
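As a reading aid, the four cases above can be written down as a small C model. This is only a sketch under stated assumptions: `struct lws_regs`, `clip32` and `lws_entry_model` are hypothetical names, the handling of unknown LWS numbers is assumed rather than taken from the message, and the real entry path is PA-RISC assembly in syscall.S, not C.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* LWS numbers, taken from the lws_table quoted above. */
enum { LWS_CAS32 = 0, LWS_CAS64 = 1 };

/* Hypothetical container for the registers the CAS entries look at. */
struct lws_regs {
    uint64_t r20;   /* LWS number */
    uint64_t r26;   /* address */
    uint64_t r25;   /* old value */
    uint64_t r24;   /* new value */
};

/* Keep the low 32 bits and zero the rest. */
static uint64_t clip32(uint64_t v)
{
    return v & UINT64_C(0xffffffff);
}

/*
 * Model of the entry path only. Returns 0 when the CAS would be carried
 * out (clipping registers in place where the message says so) and ENOSYS
 * otherwise. The sign convention of the real error register is not
 * modelled here.
 */
static int lws_entry_model(bool kernel_is_64bit, struct lws_regs *regs)
{
    assert(regs != NULL);

    /* The LWS number is always clipped to 32 bits. */
    regs->r20 = clip32(regs->r20);

    switch (regs->r20) {
    case LWS_CAS32:
        if (kernel_is_64bit) {
            /* 64-bit kernel, entry #0: clip every argument register. */
            regs->r26 = clip32(regs->r26);
            regs->r25 = clip32(regs->r25);
            regs->r24 = clip32(regs->r24);
        }
        return 0;
    case LWS_CAS64:
        /* Entry #1 exists only on a 64-bit kernel; registers untouched. */
        return kernel_is_64bit ? 0 : ENOSYS;
    default:
        /* Assumption: unknown LWS numbers are rejected the same way. */
        return ENOSYS;
    }
}
```

Calling `lws_entry_model(true, &regs)` with `regs.r20 == LWS_CAS32` leaves only the low halves of `r26`, `r25` and `r24`, while `lws_entry_model(false, &regs)` with `regs.r20 == LWS_CAS64` reports `ENOSYS`.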
Patch attached.
John, could you comment on the ABI wrt the 64-bit runtime?
Helge, please review. I rolled in your fix for the superfluous .align
16. I would like your signed-off-by if you think the patch is good.
[PARISC] Document LWS ABI and LWS cleanups.
Document the LWS ABI including implementation notes for
userspace, and comment cleanup.
Remove extraneous .align 16 after lws_lock_start.
Signed-off-by: Carlos O'Donell <carlos@systemhalted.org>
[-- Attachment #2: syscall.S.diff --]
[-- Type: text/x-diff; name=syscall.S.diff, Size: 2401 bytes --]
diff --git a/arch/parisc/kernel/syscall.S b/arch/parisc/kernel/syscall.S
index 69b6eeb..3fc73ad 100644
--- a/arch/parisc/kernel/syscall.S
+++ b/arch/parisc/kernel/syscall.S
@@ -365,17 +365,51 @@ tracesys_sigexit:
/*********************************************************
- Light-weight-syscall code
+ 32/64-bit Light-Weight-Syscall ABI
- r20 - lws number
- r26,r25,r24,r23,r22 - Input registers
- r28 - Function return register
- r21 - Error code.
+ * - Indicates a hint for userspace inline asm
+ implementations.
- Scracth: Any of the above that aren't being
- currently used, including r1.
+ Syscall number (caller-saves)
+ - %r20
+ * In asm clobber.
- Return pointer: r31 (Not usable)
+ Argument registers (caller-saves)
+ - %r26, %r25, %r24, %r23, %r22
+ * In asm input.
+
+ Return registers (caller-saves)
+ - %r28 (return), %r21 (errno)
+ * In asm output.
+
+ Caller-saves registers
+ - %r1, %r27, %r29
+ - %r2 (return pointer)
+ - %r31 (ble link register)
+ * In asm clobber.
+
+ Callee-saves registers
+ - %r3-%r18
+ - %r30 (stack pointer)
+ * Not in asm clobber.
+
+ If userspace is 32-bit:
+ Callee-saves registers
+ - %r19 (32-bit PIC register)
+
+ Differences from 32-bit calling convention:
+ - Syscall number in %r20
+ - Additional argument register %r22 (arg4)
+ - Callee-saves %r19.
+
+ If userspace is 64-bit:
+ Callee-saves registers
+ - %r27 (64-bit PIC register)
+
+ Differences from 64-bit calling convention:
+ - Syscall number in %r20
+ - Additional argument register %r22 (arg4)
+ - Callee-saves %r27.
Error codes returned by entry path:
@@ -473,7 +507,8 @@ lws_compare_and_swap64:
b,n lws_compare_and_swap
#else
/* If we are not a 64-bit kernel, then we don't
- * implement having 64-bit input registers
+ * have 64-bit input registers, and calling
+ * the 64-bit LWS CAS returns ENOSYS.
*/
b,n lws_exit_nosys
#endif
@@ -635,12 +670,15 @@ END(sys_call_table64)
/*
All light-weight-syscall atomic operations
will use this set of locks
+
+ NOTE: The lws_lock_start symbol must be
+ at least 16-byte aligned for safe use
+ with ldcw.
*/
.section .data
.align PAGE_SIZE
ENTRY(lws_lock_start)
/* lws locks */
- .align 16
.rept 16
/* Keep locks aligned at 16-bytes */
.word 1
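The alignment note in the last hunk can be illustrated with a portable C model of the lock table. This is a sketch, not the kernel's data layout: the names `lws_lock_model`, `lws_locks_model`, `lws_ldcw_model` and `lws_unlock_model` are hypothetical, the "1 means free, 0 means held" reading follows the usual ldcw convention and the `.word 1` initializer, and C11 atomics stand in for the ldcw instruction.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>

#define LWS_NUM_LOCKS 16   /* matches ".rept 16" in the patch */

/* One lock slot: a lock word in a 16-byte, 16-byte-aligned slot. */
struct lws_lock_model {
    alignas(16) atomic_uint word;   /* 1 = free, 0 = held */
};

/* The array inherits the 16-byte alignment of its element type. */
static struct lws_lock_model lws_locks_model[LWS_NUM_LOCKS];

/* Mark every lock as free, like the ".word 1" initializers. */
static void lws_locks_model_init(void)
{
    for (size_t i = 0; i < LWS_NUM_LOCKS; i++)
        atomic_init(&lws_locks_model[i].word, 1u);
}

/*
 * Stand-in for ldcw: atomically read the word and leave zero behind.
 * A nonzero result means the caller now holds the lock.
 */
static unsigned lws_ldcw_model(struct lws_lock_model *lock)
{
    assert(lock != NULL);
    return atomic_exchange(&lock->word, 0u);
}

/* Release the lock by storing the "free" value back. */
static void lws_unlock_model(struct lws_lock_model *lock)
{
    assert(lock != NULL);
    atomic_store(&lock->word, 1u);
}
```

Because every slot is 16 bytes wide and the table start is 16-byte aligned, each lock word is 16-byte aligned without a per-entry `.align`, which is the property the NOTE in the patch relies on.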
* Re: [PATCH] Document LWS ABI.
2008-07-15 12:33 [PATCH] Document LWS ABI Carlos O'Donell
@ 2008-07-15 17:32 ` Helge Deller
2008-07-15 19:51 ` Carlos O'Donell
0 siblings, 1 reply; 7+ messages in thread
From: Helge Deller @ 2008-07-15 17:32 UTC (permalink / raw)
To: Carlos O'Donell; +Cc: Kyle McMartin, John David Anglin, linux-parisc

Carlos O'Donell wrote:
> Helge,
>
> The LWS interface *already* has a 64-bit runtime entry point, see
> arch/parisc/kernel/syscall.S:
> [....]

Sure. I thought the discussion was about adding a 64-bit runtime entry point
which operates on a 64-bit pointer. But this can be discussed when
64-bit userspace is coming...

> Helge, please review. I rolled in your fix for the superfluous .align
> 16. I would like your signed-off-by if you think the patch is good.
>
> [PARISC] Document LWS ABI and LWS cleanups.
>
> Document the LWS ABI including implementation notes for
> userspace, and comment cleanup.
>
> Remove extraneous .align 16 after lws_lock_start.
>
> Signed-off-by: Carlos O'Donell <carlos@systemhalted.org>

Signed-off-by: Helge Deller <deller@gmx.de>
* Re: [PATCH] Document LWS ABI.
2008-07-15 17:32 ` Helge Deller
@ 2008-07-15 19:51 ` Carlos O'Donell
2008-07-16 0:15 ` John David Anglin
0 siblings, 1 reply; 7+ messages in thread
From: Carlos O'Donell @ 2008-07-15 19:51 UTC (permalink / raw)
To: Helge Deller; +Cc: Kyle McMartin, John David Anglin, linux-parisc
[-- Attachment #1: Type: text/plain, Size: 1011 bytes --]

On Tue, Jul 15, 2008 at 1:32 PM, Helge Deller <deller@gmx.de> wrote:
>> The LWS interface *already* has a 64-bit runtime entry point, see
>> arch/parisc/kernel/syscall.S:
>> [....]
>
> Sure. I thought the discussion was about adding a 64-bit runtime entry point
> which operates on a 64-bit pointer. But this can be discussed when
> 64-bit userspace is coming...

The entry point is already there. It's LWS #1. Apparently I created it
with a 64-bit userspace in mind.

>> Helge, please review. I rolled in your fix for the superfluous .align
>> 16. I would like your signed-off-by if you think the patch is good.
>>
>> [PARISC] Document LWS ABI and LWS cleanups.
>>
>> Document the LWS ABI including implementation notes for
>> userspace, and comment cleanup.
>>
>> Remove extraneous .align 16 after lws_lock_start.
>>
>> Signed-off-by: Carlos O'Donell <carlos@systemhalted.org>
>
> Signed-off-by: Helge Deller <deller@gmx.de>

Thanks.

Kyle,

Please apply the patch and send it upstream?

Cheers,
Carlos.
[-- Attachment #2: syscall.S.diff --]
[-- Type: text/x-diff; name=syscall.S.diff, Size: 2688 bytes --]
[PARISC] Document LWS ABI and LWS cleanups.

Document the LWS ABI including implementation notes for
userspace, and comment cleanup.

Remove extraneous .align 16 after lws_lock_start.
Signed-off-by: Carlos O'Donell <carlos@systemhalted.org>
Signed-off-by: Helge Deller <deller@gmx.de>
* Re: [PATCH] Document LWS ABI.
2008-07-15 19:51 ` Carlos O'Donell
@ 2008-07-16 0:15 ` John David Anglin
2008-07-16 0:54 ` Carlos O'Donell
0 siblings, 1 reply; 7+ messages in thread
From: John David Anglin @ 2008-07-16 0:15 UTC (permalink / raw)
To: Carlos O'Donell; +Cc: deller, kyle, dave.anglin, linux-parisc

> >> The LWS interface *already* has a 64-bit runtime entry point, see
> >> arch/parisc/kernel/syscall.S:
> >> [....]
> >
> > Sure. I thought the discussion was about adding a 64-bit runtime entry point
> > which operates on a 64-bit pointer. But this can be discussed when
> > 64-bit userspace is coming...
>
> The entry point is already there. It's LWS #1. Apparently I created it
> with a 64-bit userspace in mind.

You didn't respond to the issue.

> Please apply the patch and send it upstream?

This ABI change is unjustified and I won't approve Helge's GCC patch
with this change. You will also have to change the glibc code to match.

Kyle indicated that he didn't want any major changes to the LWS interface,
so reserving several new registers is unnecessary.

Dave
--
J. David Anglin dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6602)
* Re: [PATCH] Document LWS ABI.
2008-07-16 0:15 ` John David Anglin
@ 2008-07-16 0:54 ` Carlos O'Donell
2008-07-16 3:05 ` John David Anglin
0 siblings, 1 reply; 7+ messages in thread
From: Carlos O'Donell @ 2008-07-16 0:54 UTC (permalink / raw)
To: John David Anglin; +Cc: deller, kyle, dave.anglin, linux-parisc

On Tue, Jul 15, 2008 at 8:15 PM, John David Anglin <dave@hiauly1.hia.nrc.ca> wrote:
>> > Sure. I thought the discussion was about adding a 64-bit runtime entry point
>> > which operates on a 64-bit pointer. But this can be discussed when
>> > 64-bit userspace is coming...
>>
>> The entry point is already there. It's LWS #1. Apparently I created it
>> with a 64-bit userspace in mind.
>
> You didn't respond to the issue.

It doesn't work for the 64-bit runtime: I only used stw/ldw, and it should
be std/ldd. Which operation to use depends on the width of userspace, so in
all likelihood I have to just duplicate the code. No real problem there.

>> Please apply the patch and send it upstream?
>
> This ABI change is unjustified and I won't approve Helge's GCC patch
> with this change. You will also have to change the glibc code to match.
>
> Kyle indicated that he didn't want any major changes to the LWS interface,
> so reserving several new registers is unnecessary.

All I have done is clarify the intended interface. There has been no
actual change from what is already implemented. The glibc code already
matches this interface.

The question is "Are you OK with the existing ABI?" :-)

Cheers,
Carlos.
* Re: [PATCH] Document LWS ABI.
2008-07-16 0:54 ` Carlos O'Donell
@ 2008-07-16 3:05 ` John David Anglin
2008-12-29 21:47 ` Helge Deller
0 siblings, 1 reply; 7+ messages in thread
From: John David Anglin @ 2008-07-16 3:05 UTC (permalink / raw)
To: Carlos O'Donell; +Cc: deller, kyle, dave.anglin, linux-parisc

> The question is "Are you OK with the existing ABI?" :-)

No. As I understand it, r2 doesn't need to be clobbered because
glibc doesn't currently clobber it. So, using it in the LWS code
would cause an ABI break. That's one register back to userspace.

I want to keep r19 and r27 for userspace so the PIC register doesn't
have to be saved and restored in the asm (linux-atomic.c is compiled
as PIC code). You can have r29.

That leaves three free registers for the LWS code: r22, r23 and r29.
The LWS ABI has r1, r20-r26 and r28-r31. Userspace has two call-clobbered
registers free across the asm in PIC code, and three in non-PIC code.
That's enough to efficiently perform the error comparisons.

The asm would be more efficient if the registers used for lws_mem,
lws_old and lws_new were not written to. This occurs only for the
call in the 32-bit runtime with a 64-bit kernel. As it stands,
the lws_mem, lws_old and lws_new arguments get reloaded every time
around the EAGAIN loop. This is the crucial code in the compare
and swap:

	/* The load and store could fail */
1:	ldw	0(%sr3,%r26), %r28
	sub,<>	%r28, %r25, %r0
2:	stw	%r24, 0(%sr3,%r26)

The sub,<> instruction uses a 32-bit compare/subtract condition, so
the clipping of r25 isn't necessary. Similarly, the stw instruction
ignores the most significant 32 bits of r24. The value in r26 needs
clipping but you have three free registers, and it looks like r1 is
also free at this point in the code. You can deposit the least
significant 32 bits of r26 into a field of zeros in another register
in one instruction.

It looks like lws_compare_and_swap64 and lws_compare_and_swap32 become
more or less functionally identical. The above would become something
like:

#ifdef CONFIG_64BIT
	depd,z	%r26,63,32,%r1
1:	ldw	0(%sr3,%r1), %r28
	sub,<>	%r28, %r25, %r0
2:	stw	%r24, 0(%sr3,%r1)
#else
1:	ldw	0(%sr3,%r26), %r28
	sub,<>	%r28, %r25, %r0
2:	stw	%r24, 0(%sr3,%r26)
#endif

The argument clipping in the current code would be removed. As a result,
the branch to lws_compare_and_swap can be eliminated in the 64-bit path.

It's my impression that the tightness of the loop for the compare/exchange
operation is important.

Dave
--
J. David Anglin dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6602)
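The argument in the message above (only the address needs explicit clipping, because the compare and the store already ignore the upper 32 bits) can be checked with a portable C model. This is a sketch under assumptions: `depd_z_low32` and `cas32_model` are hypothetical names, the model is not atomic (the kernel code runs under an LWS lock), and the address is passed as an ordinary pointer instead of a space-qualified PA-RISC address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Model of "depd,z %r26,63,32,%r1": the low 32 bits of the source are
 * deposited into a field of zeros, so the upper 32 bits of the result
 * are cleared.
 */
static uint64_t depd_z_low32(uint64_t r26)
{
    return r26 & UINT64_C(0xffffffff);
}

/*
 * Non-atomic model of the ldw / sub,<> / stw sequence.
 * r25_old and r24_new are full 64-bit register values; only their low
 * 32 bits take part, which is why they need no explicit clipping.
 * Returns the value loaded from memory (what ends up in %r28).
 */
static uint32_t cas32_model(uint32_t *mem, uint64_t r25_old, uint64_t r24_new)
{
    assert(mem != NULL);

    uint32_t r28 = *mem;               /* ldw: load the 32-bit word */
    if (r28 == (uint32_t)r25_old)      /* sub,<>: 32-bit compare; skips the store on mismatch */
        *mem = (uint32_t)r24_new;      /* stw: stores only the low 32 bits */
    return r28;
}
```

With a word holding 5, `cas32_model(&word, 0xffffffff00000005, 0x1234567800000009)` still matches and stores 9, which mirrors the point that garbage in the upper halves of r25 and r24 is harmless.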
* Re: [PATCH] Document LWS ABI.
2008-07-16 3:05 ` John David Anglin
@ 2008-12-29 21:47 ` Helge Deller
0 siblings, 0 replies; 7+ messages in thread
From: Helge Deller @ 2008-12-29 21:47 UTC (permalink / raw)
To: John David Anglin, Carlos O'Donell, linux-parisc
[-- Attachment #1: Type: text/plain, Size: 2806 bytes --]

Carlos, Dave,

This patch hasn't been fully discussed (and merged) yet.
I've attached the last version of the patch from Carlos; that way it gets
archived in Kyle's Patchwork as well :-)

My personal opinion is that we should try to reduce the number of
clobbered registers (which is in line with what Dave said below).

Thread is here:
http://marc.info/?t=121612540800004&r=1&w=2

Helge

John David Anglin wrote:
>> The question is "Are you OK with the existing ABI?" :-)
>
> No. As I understand it, r2 doesn't need to be clobbered because
> glibc doesn't currently clobber it. So, using it in the LWS code
> would cause an ABI break. That's one register back to userspace.
>
> I want to keep r19 and r27 for userspace so the PIC register doesn't
> have to be saved and restored in the asm (linux-atomic.c is compiled
> as PIC code). You can have r29.
>
> That leaves three free registers for the LWS code: r22, r23 and r29.
> The LWS ABI has r1, r20-r26 and r28-r31. Userspace has two call-clobbered
> registers free across the asm in PIC code, and three in non-PIC code.
> That's enough to efficiently perform the error comparisons.
>
> The asm would be more efficient if the registers used for lws_mem,
> lws_old and lws_new were not written to. This occurs only for the
> call in the 32-bit runtime with a 64-bit kernel. As it stands,
> the lws_mem, lws_old and lws_new arguments get reloaded every time
> around the EAGAIN loop. This is the crucial code in the compare
> and swap:
>
>	/* The load and store could fail */
> 1:	ldw	0(%sr3,%r26), %r28
>	sub,<>	%r28, %r25, %r0
> 2:	stw	%r24, 0(%sr3,%r26)
>
> The sub,<> instruction uses a 32-bit compare/subtract condition, so
> the clipping of r25 isn't necessary. Similarly, the stw instruction
> ignores the most significant 32 bits of r24. The value in r26 needs
> clipping but you have three free registers, and it looks like r1 is
> also free at this point in the code. You can deposit the least
> significant 32 bits of r26 into a field of zeros in another register
> in one instruction.
>
> It looks like lws_compare_and_swap64 and lws_compare_and_swap32 become
> more or less functionally identical. The above would become something
> like:
>
> #ifdef CONFIG_64BIT
>	depd,z	%r26,63,32,%r1
> 1:	ldw	0(%sr3,%r1), %r28
>	sub,<>	%r28, %r25, %r0
> 2:	stw	%r24, 0(%sr3,%r1)
> #else
> 1:	ldw	0(%sr3,%r26), %r28
>	sub,<>	%r28, %r25, %r0
> 2:	stw	%r24, 0(%sr3,%r26)
> #endif
>
> The argument clipping in the current code would be removed. As a result,
> the branch to lws_compare_and_swap can be eliminated in the 64-bit path.
>
> It's my impression that the tightness of the loop for the compare/exchange
> operation is important.
>
> Dave
[-- Attachment #2: syscall.S.diff --]
[-- Type: text/x-patch, Size: 2688 bytes --]
[PARISC] Document LWS ABI and LWS cleanups.

Document the LWS ABI including implementation notes for
userspace, and comment cleanup.

Remove extraneous .align 16 after lws_lock_start.
Signed-off-by: Carlos O'Donell <carlos@systemhalted.org>
Signed-off-by: Helge Deller <deller@gmx.de>