* Re: [PATCH] man2 : syscall.2 : document syscall calling conventions [not found] ` <201304021917.17659.vapier@gentoo.org> @ 2013-04-07 10:00 ` Michael Kerrisk (man-pages) 2013-04-07 13:55 ` Kyle McMartin 0 siblings, 1 reply; 19+ messages in thread From: Michael Kerrisk (man-pages) @ 2013-04-07 10:00 UTC (permalink / raw) To: Mike Frysinger Cc: linux-man, Kyle McMartin, Helge Deller, James E.J. Bottomley, linux-parisc [Adding a few people to CC who may be able to help with Mike's doubts on PA-RISC; folks, if any of you could have a quick look at the parisc piece below, that would be helpful] Mike, On Wed, Apr 3, 2013 at 1:17 AM, Mike Frysinger <vapier@gentoo.org> wrote: > On Tuesday 02 April 2013 02:54:39 Michael Kerrisk (man-pages) wrote: >> On Mon, Apr 1, 2013 at 12:32 PM, Mike Frysinger wrote: >> > On Monday 01 April 2013 05:29:11 Michael Kerrisk (man-pages) wrote: >> >> On Mon, Apr 1, 2013 at 10:29 AM, Mike Frysinger wrote: >> >> > on a related topic, would it be useful to document the exact calling >> >> > convention for architecture system calls ? from time to time, i need >> >> > to reference this, and i inevitably turn to a variety of sources to >> >> > dig up the answer (the kernel itself, or strace, or qemu, or glibc, >> >> > or uClibc, or lss, or other random places). i would find it handy to >> >> > have all of these in a single location. >> >> >> >> Sounds like it would be useful to have that documented. Would you have >> >> a chance to write patches for that? >> > >> > should we do it in syscall(2) ? or a dedicated man page ? >> >> It's a little hard to say until I see the shape of what comes. Can you >> provide a rough per-syscall example or two of what you expect to >> document? (Don't write too concrete a patch yet, until I can get a >> handle on what you intend.) > > this renders nicely i think. it shows most of the stuff i'm interested in. > might be useful to add a dedicated section covering the clobbers in the > future. Thanks for that. 
It looks good to me, and I have applied. But it renders too wide (wherever possible, I try to ensure that everything renders inside 80 columns), so I have split into tables, one with "instruction, NR, ret" and another with the arguments (arg1 to arg7).

Now, just to make 100% sure of your intention, the NR column would be better named "syscall #" (or similar), right? (I've made that change.)

> --- a/man2/syscall.2
> +++ b/man2/syscall.2
> @@ -79,6 +79,35 @@ and an error code is stored in
>  .BR syscall ()
>  first appeared in
>  4BSD.
> +.SS Architecture calling conventions
> +Every architecture has its own way of invoking & passing arguments to the
> +kernel.
> +Note that the instruction listed below might not be the fastest or best way to
> +transition to the kernel, so you might have to refer to the VDSO.

Mike, any chance that I could interest you in writing a vdso(7) man page? I've felt the lack of such a page for a while (it need not be too long), but am not deep enough into the details to write it easily (I am not sure if you are).

> +Also note that this doesn't cover the entire calling convention -- some
> +architectures may indiscriminately clobber other registers not listed here.
> +.if t \{\
> +.ft CW
> +\}
> +.TS
> +l l l l l l l l l l l.
> +arch/ABI	insn	NR	ret	arg1	arg2	arg3	arg4	arg5	arg6	arg7
> +_
> +arm/OABI	swi NR;	-	a1	a1	a2	a3	a4	v1	v2	v3
> +arm/EABI	swi 0x0;	r7	r1	r1	r2	r3	r4	r5	r6	r7
> +bfin	excpt 0x0;	P0	R0	R0	R1	R2	R3	R4	R5	-
> +i386	int $0x80;	eax	eax	ebx	ecx	edx	esi	edi	ebp	-
> +ia64	break 0x100000;	r15	r10/r8	r11	r9	r10	r14	r15	r13	-
> +.\" not sure about insn or NR
> +.\" parisc	ble 0x100(%%sr2, %%r0);	-	r28	r26	r25	r24	r23	r22	r21	-

PA-RISC folks, are you able to confirm/correct the above?

> +sparc/32	t 0x10;	g1	o0	o0	o1	o2	o3	o4	o5	-
> +sparc/64	t 0x6d;	g1	o0	o0	o1	o2	o3	o4	o5	-
> +x86_64	syscall;	rax	rax	rdi	rsi	rdx	r10	r8	r9	-
> +.TE
> +.if t \{\
> +.in
> +.ft P
> +\}
>  .SS Architecture-specific requirements
>  Each architecture ABI has its own requirements on how
>  system call arguments are passed to the kernel.

Cheers,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Author of "The Linux Programming Interface"; http://man7.org/tlpi/

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] man2 : syscall.2 : document syscall calling conventions 2013-04-07 10:00 ` [PATCH] man2 : syscall.2 : document syscall calling conventions Michael Kerrisk (man-pages) @ 2013-04-07 13:55 ` Kyle McMartin 2013-04-07 14:56 ` James Bottomley [not found] ` <20130407135514.GW12938-PfSpb0PWhxZc2C7mugBRk2EX/6BAtgUQ@public.gmane.org> 0 siblings, 2 replies; 19+ messages in thread From: Kyle McMartin @ 2013-04-07 13:55 UTC (permalink / raw) To: Michael Kerrisk (man-pages) Cc: Mike Frysinger, linux-man, Kyle McMartin, Helge Deller, James E.J. Bottomley, linux-parisc On Sun, Apr 07, 2013 at 12:00:50PM +0200, Michael Kerrisk (man-pages) wrote: > [Adding a few people to CC who may be able to help with Mike's doubts > on PA-RISC; folks, if any of you could have a quick look at the parisc > piece below, that would be helpful] > The syscall number is in %r20, everything else looks correct. The returned value is in %r28 and the args are %r26 through %r21. --Kyle > Mike, > > On Wed, Apr 3, 2013 at 1:17 AM, Mike Frysinger <vapier@gentoo.org> wrote: > > On Tuesday 02 April 2013 02:54:39 Michael Kerrisk (man-pages) wrote: > >> On Mon, Apr 1, 2013 at 12:32 PM, Mike Frysinger wrote: > >> > On Monday 01 April 2013 05:29:11 Michael Kerrisk (man-pages) wrote: > >> >> On Mon, Apr 1, 2013 at 10:29 AM, Mike Frysinger wrote: > >> >> > on a related topic, would it be useful to document the exact calling > >> >> > convention for architecture system calls ? from time to time, i need > >> >> > to reference this, and i inevitably turn to a variety of sources to > >> >> > dig up the answer (the kernel itself, or strace, or qemu, or glibc, > >> >> > or uClibc, or lss, or other random places). i would find it handy to > >> >> > have all of these in a single location. > >> >> > >> >> Sounds like it would be useful to have that documented. Would you have > >> >> a chance to write patches for that? > >> > > >> > should we do it in syscall(2) ? or a dedicated man page ? 
> >> > >> It's a little hard to say until I see the shape of what comes. Can you > >> provide a rough per-syscall example or two of what you expect to > >> document? (Don't write too concrete a patch yet, until I can get a > >> handle on what you intend.) > > > > this renders nicely i think. it shows most of the stuff i'm interested in. > > might be useful to add a dedicated section covering the clobbers in the > > future. > > Thanks for that. It looks good to me, and I have applied. But it > renders too wide (wherever possible, I try to ensure that everything > renders inside 80 columns), so I have split into tables, one with > "instruction, NR, ret" and another with the arguments (arg1 to arg7). > > Now, just to make 100% sure of your intention, the NR column would be > better named "syscall #" (or similar), right? (I've made that change.) > > > --- a/man2/syscall.2 > > +++ b/man2/syscall.2 > > @@ -79,6 +79,35 @@ and an error code is stored in > > .BR syscall () > > first appeared in > > 4BSD. > > +.SS Architecture calling conventions > > +Every architecture has its own way of invoking & passing arguments to the > > +kernel. > > +Note that the instruction listed below might not be the fastest or best way to > > +transition to the kernel, so you might have to refer to the VDSO. > > Mike, any chance that I could interest you in writing a vdso(7) man > page? I've felt the lack of such a page for a while (it need not be > too long), but am not deep enough into the details to write it easily > (I am not sure if you are). > > > +Also note that this doesn't cover the entire calling convention -- some > > +architectures may indiscriminately clobber other registers not listed here. > > +.if t \{\ > > +.ft CW > > +\} > > +.TS > > +l l l l l l l l l l l. 
> > +arch/ABI insn NR ret arg1 arg2 arg3 arg4 arg5 arg6 arg7 > > +_ > > +arm/OABI swi NR; - a1 a1 a2 a3 a4 v1 v2 v3 > > +arm/EABI swi 0x0; r7 r1 r1 r2 r3 r4 r5 r6 r7 > > +bfin excpt 0x0; P0 R0 R0 R1 R2 R3 R4 R5 - > > +i386 int $0x80; eax eax ebx ecx edx esi edi ebp - > > +ia64 break 0x100000; r15 r10/r8 r11 r9 r10 r14 r15 r13 - > > +.\" not sure about insn or NR > > +.\" parisc ble 0x100(%%sr2, %%r0); - r28 r26 r25 r24 r23 r22 r21 - > > PA-RISC folks, are you able to confirm/correct the above? > > > +sparc/32 t 0x10; g1 o0 o0 o1 o2 o3 o4 o5 - > > +sparc/64 t 0x6d; g1 o0 o0 o1 o2 o3 o4 o5 - > > +x86_64 syscall; rax rax rdi rsi rdx r10 r8 r9 - > > +.TE > > +.if t \{\ > > +.in > > +.ft P > > +\} > > .SS Architecture-specific requirements > > Each architecture ABI has its own requirements on how > > system call arguments are passed to the kernel. > > Cheers, > > Michael > > -- > Michael Kerrisk > Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ > Author of "The Linux Programming Interface"; http://man7.org/tlpi/ > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] man2 : syscall.2 : document syscall calling conventions 2013-04-07 13:55 ` Kyle McMartin @ 2013-04-07 14:56 ` James Bottomley 2013-04-07 15:11 ` Kyle McMartin [not found] ` <20130407135514.GW12938-PfSpb0PWhxZc2C7mugBRk2EX/6BAtgUQ@public.gmane.org> 1 sibling, 1 reply; 19+ messages in thread From: James Bottomley @ 2013-04-07 14:56 UTC (permalink / raw) To: Kyle McMartin Cc: Michael Kerrisk (man-pages), Mike Frysinger, linux-man, Kyle McMartin, Helge Deller, James E.J. Bottomley, linux-parisc On Sun, 2013-04-07 at 09:55 -0400, Kyle McMartin wrote: > On Sun, Apr 07, 2013 at 12:00:50PM +0200, Michael Kerrisk (man-pages) wrote: > > [Adding a few people to CC who may be able to help with Mike's doubts > > on PA-RISC; folks, if any of you could have a quick look at the parisc > > piece below, that would be helpful] > > > > The syscall number is in %r20, everything else looks correct. The > returned value is in %r28 and the args are %r26 through %r21. Actually, that's not quite correct. on 64 bits it's arg1-8 are %r26-%r19 but on 32 the convention is that arg1-arg4 are %r26-%r23 and the rest on stack. We can also do register pair combining on 32 bits for a 64 bit argument. Our register use is documented in Documentation/parisc/registers James ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] man2 : syscall.2 : document syscall calling conventions 2013-04-07 14:56 ` James Bottomley @ 2013-04-07 15:11 ` Kyle McMartin [not found] ` <20130407151134.GX12938-PfSpb0PWhxZc2C7mugBRk2EX/6BAtgUQ@public.gmane.org> 0 siblings, 1 reply; 19+ messages in thread From: Kyle McMartin @ 2013-04-07 15:11 UTC (permalink / raw) To: James Bottomley Cc: Michael Kerrisk (man-pages), Mike Frysinger, linux-man, Kyle McMartin, Helge Deller, James E.J. Bottomley, linux-parisc On Sun, Apr 07, 2013 at 07:56:49AM -0700, James Bottomley wrote: > Actually, that's not quite correct. on 64 bits it's arg1-8 are > %r26-%r19 but on 32 the convention is that arg1-arg4 are %r26-%r23 and the > rest on stack. We can also do register pair combining on 32 bits for a > 64 bit argument. I guess the confusion is whether you're writing this from the kernel side or the userspace side. The syscall instruction is called with six arg registers, but we fix it on entry to the kernel when we call into C. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] man2 : syscall.2 : document syscall calling conventions [not found] ` <20130407151134.GX12938-PfSpb0PWhxZc2C7mugBRk2EX/6BAtgUQ@public.gmane.org> @ 2013-04-07 15:38 ` James Bottomley 2013-04-08 9:18 ` Michael Kerrisk (man-pages) 1 sibling, 0 replies; 19+ messages in thread From: James Bottomley @ 2013-04-07 15:38 UTC (permalink / raw) To: Kyle McMartin Cc: Michael Kerrisk (man-pages), Mike Frysinger, linux-man, Kyle McMartin, Helge Deller, James E.J. Bottomley, linux-parisc-u79uwXL29TY76Z2rM5mHXA On Sun, 2013-04-07 at 11:11 -0400, Kyle McMartin wrote: > On Sun, Apr 07, 2013 at 07:56:49AM -0700, James Bottomley wrote: > > Actually, that's not quite correct. on 64 bits it's arg1-8 are > > %r26-%r19 but on 32 the convention is that arg1-arg4 are %r26-%r23 and the > > rest on stack. We can also do register pair combining on 32 bits for a > > 64 bit argument. > > I guess the confusion is whether you're writing this from the kernel > side or the userspace side. The syscall instruction is called with six > arg registers, but we fix it on entry to the kernel when we call into C. Oh, right, syscall arguments, sorry didn't manage to extract the content from all the quotes. I was just thinking general ABI. The syscall arguments are all in arch/parisc/include/asm/unistd.h As Kyle says, we override the calling convention and define in-register arguments even on 32 bit (so %r26-%r21). We actually don't define _syscall6() yet, but we're ready for it. James -- To unsubscribe from this list: send the line "unsubscribe linux-man" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 19+ messages in thread
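The three PA-RISC conventions that came up in this sub-thread (general 32-bit ABI, general 64-bit ABI, and the syscall convention) can be compared side by side with a small sketch. This is only an illustration of what James and Kyle describe above; the function name and the labels "abi32", "abi64" and "syscall" are made up for this example and do not come from the kernel or any library.

```python
def parisc_arg_location(index, convention):
    """Return where the 1-based argument `index` is passed on parisc.

    `convention` is one of the hypothetical labels "abi32", "abi64"
    or "syscall"; the register counts are taken from the thread.
    """
    # How many leading arguments travel in registers:
    #   abi32:   arg1-arg4 in %r26-%r23, the rest on the stack
    #   abi64:   arg1-arg8 in %r26-%r19
    #   syscall: arg1-arg6 in %r26-%r21 (on both 32 and 64 bits)
    in_registers = {"abi32": 4, "abi64": 8, "syscall": 6}[convention]
    if index < 1:
        raise ValueError("argument index is 1-based")
    if index <= in_registers:
        # Argument registers count down from %r26: arg1 -> %r26, arg2 -> %r25, ...
        return "%r{}".format(27 - index)
    if convention == "abi32":
        return "stack"
    # The thread does not say what happens past the last register here.
    raise ValueError("not described in the thread")


print(parisc_arg_location(5, "abi32"))    # general 32-bit ABI: fifth argument
print(parisc_arg_location(5, "syscall"))  # syscall convention: fifth argument
```

The point of the comparison is the fifth and sixth arguments: under the general 32-bit ABI they would go on the stack, while the syscall convention keeps them in %r22 and %r21.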
* Re: [PATCH] man2 : syscall.2 : document syscall calling conventions [not found] ` <20130407151134.GX12938-PfSpb0PWhxZc2C7mugBRk2EX/6BAtgUQ@public.gmane.org> 2013-04-07 15:38 ` James Bottomley @ 2013-04-08 9:18 ` Michael Kerrisk (man-pages) 1 sibling, 0 replies; 19+ messages in thread From: Michael Kerrisk (man-pages) @ 2013-04-08 9:18 UTC (permalink / raw) To: Kyle McMartin Cc: James Bottomley, Mike Frysinger, linux-man, Kyle McMartin, Helge Deller, James E.J. Bottomley, linux-parisc-u79uwXL29TY76Z2rM5mHXA On Sun, Apr 7, 2013 at 5:11 PM, Kyle McMartin <kyle-pfcGkIkfWfAsA/PxXw9srA@public.gmane.org> wrote: > On Sun, Apr 07, 2013 at 07:56:49AM -0700, James Bottomley wrote: >> Actually, that's not quite correct. on 64 bits it's arg1-8 are >> %r26-%r19 but on 32 the convention is that arg1-arg4 are %r26-%r23 and the >> rest on stack. We can also do register pair combining on 32 bits for a >> 64 bit argument. > > I guess the confusion is whether you're writing this from the kernel > side or the userspace side. The syscall instruction is called with six > arg registers, but we fix it on entry to the kernel when we call into C. Thanks, Kyle. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] man2 : syscall.2 : document syscall calling conventions [not found] ` <20130407135514.GW12938-PfSpb0PWhxZc2C7mugBRk2EX/6BAtgUQ@public.gmane.org> @ 2013-04-07 18:39 ` Mike Frysinger 2013-04-07 18:48 ` John David Anglin 0 siblings, 1 reply; 19+ messages in thread From: Mike Frysinger @ 2013-04-07 18:39 UTC (permalink / raw) To: Kyle McMartin Cc: Michael Kerrisk (man-pages), linux-man, Kyle McMartin, Helge Deller, James E.J. Bottomley, linux-parisc-u79uwXL29TY76Z2rM5mHXA [-- Attachment #1: Type: Text/Plain, Size: 884 bytes --] On Sunday 07 April 2013 09:55:14 Kyle McMartin wrote: > On Sun, Apr 07, 2013 at 12:00:50PM +0200, Michael Kerrisk (man-pages) wrote: > > [Adding a few people to CC who may be able to help with Mike's doubts > > on PA-RISC; folks, if any of you could have a quick look at the parisc > > piece below, that would be helpful] > > The syscall number is in %r20, everything else looks correct. The > returned value is in %r28 and the args are %r26 through %r21. just to be clear, the only insn you need is: ble 0x100(%sr2, %r0); the kernel docs say sr2 holds the kernel gateway page (so i guess 0x100 is a known offset into that). the docs don't mention r0 that i can see, so i'm guessing it's one of those "always 0" registers ? the sysdep code has an ldi call in the branch delay slot (i think), but all that seems to do is load r20 with the syscall nr. -mike [-- Attachment #2: This is a digitally signed message part. --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] man2 : syscall.2 : document syscall calling conventions 2013-04-07 18:39 ` Mike Frysinger @ 2013-04-07 18:48 ` John David Anglin [not found] ` <BLU0-SMTP986B123D17DB8B88214F797C40-MsuGFMq8XAE@public.gmane.org> 2013-04-12 1:55 ` Mike Frysinger 0 siblings, 2 replies; 19+ messages in thread From: John David Anglin @ 2013-04-07 18:48 UTC (permalink / raw) To: Mike Frysinger Cc: Kyle McMartin, Michael Kerrisk (man-pages), linux-man, Kyle McMartin, Helge Deller, James E.J. Bottomley, linux-parisc On 7-Apr-13, at 2:39 PM, Mike Frysinger wrote: > just to be clear, the only insn you need is: > ble 0x100(%sr2, %r0); > > the kernel docs say sr2 holds the kernel gateway page (so i guess > 0x100 is a > known offset into that). the docs don't mention r0 that i can see, > so i'm > guessing it's one of those "always 0" registers ? Yes. There is also an entry at offset 0xb0 for light-weight-syscalls. Currently, this implements an atomic CAS operation used for pthread support. Dave -- John David Anglin dave.anglin@bell.net ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] man2 : syscall.2 : document syscall calling conventions [not found] ` <BLU0-SMTP986B123D17DB8B88214F797C40-MsuGFMq8XAE@public.gmane.org> @ 2013-04-08 9:20 ` Michael Kerrisk (man-pages) 0 siblings, 0 replies; 19+ messages in thread From: Michael Kerrisk (man-pages) @ 2013-04-08 9:20 UTC (permalink / raw) To: Mike Frysinger, Kyle McMartin Cc: John David Anglin, linux-man, Helge Deller, James E.J. Bottomley, linux-parisc-u79uwXL29TY76Z2rM5mHXA

On Sun, Apr 7, 2013 at 8:48 PM, John David Anglin <dave.anglin-CzeTG9NwML0@public.gmane.org> wrote:
> On 7-Apr-13, at 2:39 PM, Mike Frysinger wrote:
>
>> just to be clear, the only insn you need is:
>> ble 0x100(%sr2, %r0);
>>
>> the kernel docs say sr2 holds the kernel gateway page (so i guess 0x100 is
>> a
>> known offset into that). the docs don't mention r0 that i can see, so i'm
>> guessing it's one of those "always 0" registers ?
>
>
> Yes. There is also an entry at offset 0xb0 for light-weight-syscalls.
> Currently,
> this implements an atomic CAS operation used for pthread support.

Mike (and Kyle),

For review, here are the tables as they now stand:

=====
   Architecture calling conventions
       Every architecture has its own way of invoking and passing arguments
       to the kernel. The details for various architectures are listed in
       the two tables below.

       The first table lists the instruction used to transition to kernel
       mode, (which might not be the fastest or best way to transition to
       the kernel, so you might have to refer to the VDSO), the register
       used to indicate the system call number, and the register used to
       return the system call result.

       arch/ABI   instruction          syscall #   retval Notes
       ──────────────────────────────────────────────────────────────────
       arm/OABI   swi NR               -           a1     NR is syscall #
       arm/EABI   swi 0x0              r7          r1
       blackfin   excpt 0x0            P0          R0
       i386       int $0x80            eax         eax
       ia64       break 0x100000       r15         r10/r8
       parisc     ble 0x100(%sr2, %r0) r20         r28
       sparc/32   t 0x10               g1          o0
       sparc/64   t 0x6d               g1          o0
       x86_64     syscall              rax         rax

       The second table shows the registers used to pass the system call
       arguments.

       arch/ABI   arg1   arg2   arg3   arg4   arg5   arg6   arg7
       ──────────────────────────────────────────────────────────
       arm/OABI   a1     a2     a3     a4     v1     v2     v3
       arm/EABI   r1     r2     r3     r4     r5     r6     r7
       blackfin   R0     R1     R2     R3     R4     R5     -
       i386       ebx    ecx    edx    esi    edi    ebp    -
       ia64       r11    r9     r10    r14    r15    r13    -
       parisc     r26    r25    r24    r23    r22    r21    -
       sparc/32   o0     o1     o2     o3     o4     o5     -
       sparc/64   o0     o1     o2     o3     o4     o5     -
       x86_64     rdi    rsi    rdx    r10    r8     r9     -

       Note that these tables don't cover the entire calling convention --
       some architectures may indiscriminately clobber other registers not
       listed here.
=====

Cheers,

Michael

^ permalink raw reply [flat|nested] 19+ messages in thread
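One way to read the two tables together is as a per-architecture record: the trap instruction, the register holding the syscall number, the return register, and the ordered argument registers. The sketch below encodes three rows from the tables (i386, x86_64, and parisc as confirmed by Kyle earlier in the thread) and shows which registers would be loaded for a given call. The names `Convention`, `CONVENTIONS` and `describe_call` are invented for this illustration, and nothing here actually performs a system call.

```python
from collections import namedtuple

# One row of the first table joined with the matching row of the second.
Convention = namedtuple("Convention", "instruction nr_reg ret_reg arg_regs")

# A subset of the architectures listed in the tables above.
CONVENTIONS = {
    "i386": Convention("int $0x80", "eax", "eax",
                       ("ebx", "ecx", "edx", "esi", "edi", "ebp")),
    "x86_64": Convention("syscall", "rax", "rax",
                         ("rdi", "rsi", "rdx", "r10", "r8", "r9")),
    "parisc": Convention("ble 0x100(%sr2, %r0)", "r20", "r28",
                         ("r26", "r25", "r24", "r23", "r22", "r21")),
}


def describe_call(arch, nr, *args):
    """Return (register assignments, trap instruction, return register)."""
    conv = CONVENTIONS[arch]
    if len(args) > len(conv.arg_regs):
        raise ValueError("too many arguments for {}".format(arch))
    # The syscall number goes in first, then each argument in table order.
    setup = [(conv.nr_reg, nr)]
    setup.extend(zip(conv.arg_regs, args))
    return setup, conv.instruction, conv.ret_reg


setup, insn, ret = describe_call("x86_64", 1, 1, 0x1000, 5)
for reg, value in setup:
    print("load {} with {:#x}".format(reg, value))
print("execute '{}', then read the result from {}".format(insn, ret))
```

Keeping the argument registers as an ordered tuple mirrors the arg1..arg7 columns, so the nth argument is simply the nth element.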
* Re: [PATCH] man2 : syscall.2 : document syscall calling conventions 2013-04-07 18:48 ` John David Anglin [not found] ` <BLU0-SMTP986B123D17DB8B88214F797C40-MsuGFMq8XAE@public.gmane.org> @ 2013-04-12 1:55 ` Mike Frysinger [not found] ` <201304112155.46349.vapier-aBrp7R+bbdUdnm+yROfE0A@public.gmane.org> 2013-04-12 14:01 ` Kyle McMartin 1 sibling, 2 replies; 19+ messages in thread From: Mike Frysinger @ 2013-04-12 1:55 UTC (permalink / raw) To: John David Anglin Cc: Kyle McMartin, Michael Kerrisk (man-pages), linux-man, Kyle McMartin, Helge Deller, James E.J. Bottomley, linux-parisc [-- Attachment #1: Type: Text/Plain, Size: 686 bytes --] On Sunday 07 April 2013 14:48:42 John David Anglin wrote: > On 7-Apr-13, at 2:39 PM, Mike Frysinger wrote: > > just to be clear, the only insn you need is: > > ble 0x100(%sr2, %r0); > > > > the kernel docs say sr2 holds the kernel gateway page (so i guess > > 0x100 is a > > known offset into that). the docs don't mention r0 that i can see, > > so i'm > > guessing it's one of those "always 0" registers ? > > Yes. There is also an entry at offset 0xb0 for light-weight- > syscalls. Currently, > this implements an atomic CAS operation used for pthread support. interesting. sounds like a poor man's vDSO. i'll document this the new vdso(7) man page. -mike [-- Attachment #2: This is a digitally signed message part. --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] man2 : syscall.2 : document syscall calling conventions [not found] ` <201304112155.46349.vapier-aBrp7R+bbdUdnm+yROfE0A@public.gmane.org> @ 2013-04-12 2:34 ` John David Anglin 2013-04-12 3:38 ` Mike Frysinger 0 siblings, 1 reply; 19+ messages in thread From: John David Anglin @ 2013-04-12 2:34 UTC (permalink / raw) To: Mike Frysinger Cc: Kyle McMartin, Michael Kerrisk (man-pages), linux-man, Kyle McMartin, Helge Deller, James E.J. Bottomley, linux-parisc-u79uwXL29TY76Z2rM5mHXA On 11-Apr-13, at 9:55 PM, Mike Frysinger wrote: > On Sunday 07 April 2013 14:48:42 John David Anglin wrote: >> On 7-Apr-13, at 2:39 PM, Mike Frysinger wrote: >>> just to be clear, the only insn you need is: >>> ble 0x100(%sr2, %r0); >>> >>> the kernel docs say sr2 holds the kernel gateway page (so i guess >>> 0x100 is a >>> known offset into that). the docs don't mention r0 that i can see, >>> so i'm >>> guessing it's one of those "always 0" registers ? >> >> Yes. There is also an entry at offset 0xb0 for light-weight- >> syscalls. Currently, >> this implements an atomic CAS operation used for pthread support. > > interesting. sounds like a poor man's vDSO. i'll document this the > new > vdso(7) man page. Not exactly, the code runs on the gateway page which is in kernel space. The main reason for doing the operation in kernel space is to prevent processes from being preempted while executing in the lock region. In general, parisc processes are not preempted on the gateway page. There are some subtleties regarding fault handling. There is support in glibc and libgcc for these calls. The libgcc implementation in linux-atomic.c is very similar to that on arm. 
Dave -- John David Anglin dave.anglin-CzeTG9NwML0@public.gmane.org ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] man2 : syscall.2 : document syscall calling conventions 2013-04-12 2:34 ` John David Anglin @ 2013-04-12 3:38 ` Mike Frysinger 2013-04-12 4:45 ` James Bottomley 0 siblings, 1 reply; 19+ messages in thread From: Mike Frysinger @ 2013-04-12 3:38 UTC (permalink / raw) To: John David Anglin Cc: Kyle McMartin, Michael Kerrisk (man-pages), linux-man, Kyle McMartin, Helge Deller, James E.J. Bottomley, linux-parisc [-- Attachment #1: Type: Text/Plain, Size: 2707 bytes --] On Thursday 11 April 2013 22:34:43 John David Anglin wrote: > On 11-Apr-13, at 9:55 PM, Mike Frysinger wrote: > > On Sunday 07 April 2013 14:48:42 John David Anglin wrote: > >> On 7-Apr-13, at 2:39 PM, Mike Frysinger wrote: > >>> just to be clear, the only insn you need is: > >>> ble 0x100(%sr2, %r0); > >>> > >>> the kernel docs say sr2 holds the kernel gateway page (so i guess > >>> 0x100 is a > >>> known offset into that). the docs don't mention r0 that i can see, > >>> so i'm > >>> guessing it's one of those "always 0" registers ? > >> > >> Yes. There is also an entry at offset 0xb0 for light-weight- > >> syscalls. Currently, > >> this implements an atomic CAS operation used for pthread support. > > > > interesting. sounds like a poor man's vDSO. i'll document this the > > new > > vdso(7) man page. > > Not exactly, the code runs on the gateway page which is in kernel space. > The main reason for doing the operation in kernel space is to prevent > processes from being preempted while executing in the lock region. In > general, > parisc processes are not preempted on the gateway page. There are > some subtleties regarding fault handling. sure ... the Blackfin arch does a similar thing for providing fast atomic primitives to userspace since the ISA can't. what do you think of this section for vdso(7) ? i might have to split the "real" vdso arches from these others since there's a couple now (arm, bfin, parisc), and i think there might be more down the line (microblaze). 
.SS parisc (hppa) functions
.\" See linux/arch/parisc/kernel/syscall.S
.\" See linux/Documentation/parisc/registers
The parisc port has a code page full of utility functions.
Rather than use the normal ELF aux vector approach, it passes the address of
the page to the process via the SR2 register.
This is done to match the way HP-UX works.

Since it's just a raw page of code, there is no ELF information for doing
symbol lookups or versioning.
Simply call into the appropriate offset via the branch instruction, e.g.:
.br
ble <offset>(%sr2, %r0)
.if t \{\
.ft CW
\}
.TS
l l.
offset	function
_
00b0	lws_entry
00e0	set_thread_pointer
0100	linux_gateway_entry (syscall)
0268	syscall_nosys
0274	tracesys
0324	tracesys_next
0368	tracesys_exit
03a0	tracesys_sigexit
03b8	lws_start
03dc	lws_exit_nosys
03e0	lws_exit
03e4	lws_compare_and_swap64
03e8	lws_compare_and_swap
0404	cas_wouldblock
0410	cas_action
.TE
.if t \{\
.in
.ft P
\}

> There is support in glibc and libgcc for these calls. The libgcc
> implementation
> in linux-atomic.c is very similar to that on arm.

interesting. another arch to add :).
-mike

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply [flat|nested] 19+ messages in thread
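As a rough illustration of the "call into the appropriate offset" idea in the draft above, the sketch below maps a few of the listed labels to the `ble <offset>(%sr2, %r0)` form. It only formats assembly text; the dictionary and function names are hypothetical, and the offsets are copied from Mike's draft table, so they should be treated as a snapshot from this thread and not as a stable interface.

```python
# Offsets into the parisc gateway page, copied from the draft table above.
# Only the entries the thread describes as callable entry points are kept.
GATEWAY_ENTRIES = {
    "lws_entry": 0x00B0,            # light-weight syscalls (atomic CAS)
    "set_thread_pointer": 0x00E0,
    "linux_gateway_entry": 0x0100,  # regular system call entry
}


def gateway_branch(name):
    """Return the branch instruction text used to reach gateway entry `name`."""
    offset = GATEWAY_ENTRIES[name]
    # %sr2 selects the gateway page and %r0 always reads as zero,
    # so the effective target is just the offset within that page.
    return "ble 0x{:x}(%sr2, %r0)".format(offset)


for name in sorted(GATEWAY_ENTRIES, key=GATEWAY_ENTRIES.get):
    print("{:<20} {}".format(name, gateway_branch(name)))
```

Because there is no symbol table on the page, a caller has to carry a table like this itself, which is the point James raises in his reply.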
* Re: [PATCH] man2 : syscall.2 : document syscall calling conventions
From: James Bottomley @ 2013-04-12 4:45 UTC
To: Mike Frysinger
Cc: John David Anglin, Kyle McMartin, Michael Kerrisk (man-pages),
    linux-man, Kyle McMartin, Helge Deller, James E.J. Bottomley,
    linux-parisc

On Thu, 2013-04-11 at 23:38 -0400, Mike Frysinger wrote:
> On Thursday 11 April 2013 22:34:43 John David Anglin wrote:
> > On 11-Apr-13, at 9:55 PM, Mike Frysinger wrote:
> > > On Sunday 07 April 2013 14:48:42 John David Anglin wrote:
> > >> On 7-Apr-13, at 2:39 PM, Mike Frysinger wrote:
> > >>> just to be clear, the only insn you need is:
> > >>> 	ble 0x100(%sr2, %r0);
> > >>>
> > >>> the kernel docs say sr2 holds the kernel gateway page (so i
> > >>> guess 0x100 is a known offset into that). the docs don't
> > >>> mention r0 that i can see, so i'm guessing it's one of those
> > >>> "always 0" registers ?
> > >>
> > >> Yes. There is also an entry at offset 0xb0 for
> > >> light-weight-syscalls. Currently, this implements an atomic CAS
> > >> operation used for pthread support.
> > >
> > > interesting. sounds like a poor man's vDSO. i'll document this
> > > in the new vdso(7) man page.
> >
> > Not exactly, the code runs on the gateway page which is in kernel
> > space. The main reason for doing the operation in kernel space is
> > to prevent processes from being preempted while executing in the
> > lock region. In general, parisc processes are not preempted on the
> > gateway page. There are some subtleties regarding fault handling.
>
> sure ... the Blackfin arch does a similar thing for providing fast
> atomic primitives to userspace since the ISA can't.
>
> what do you think of this section for vdso(7) ? i might have to split
> the "real" vdso arches from these others since there's a couple now
> (arm, bfin, parisc), and i think there might be more down the line
> (microblaze).

I've got to say, I really don't think this can be classified as a vdso.
For a vdso, the kernel exports an ELF object that can be linked
dynamically into any elf binary requiring it. The ELF section
information provides full details and so vdso entries can be called by
symbol.

In the parisc gateway page implementation, we have a set of "hidden"
primitives which the executable must know how to call (no self
description like a vdso). This mechanism is identical to the original
intent of the x86 int <n> instruction (an instruction that traps into
the kernel and performs some primitive action but to use it, you have to
know which function corresponds to which value of <n>).

James

> .SS parisc (hppa) functions
> .\" See linux/arch/parisc/kernel/syscall.S
> .\" See linux/Documentation/parisc/registers
> The parisc port has a code page full of utility functions.
> Rather than use the normal ELF aux vector approach, it passes the address of
> the page to the process via the SR2 register.
> This is done to match the way HP-UX works.
>
> Since it's just a raw page of code, there is no ELF information for doing
> symbol lookups or versioning.
> Simply call into the appropriate offset via the branch instruction, e.g.:
> .br
> ble <offset>(%sr2, %r0)
> .if t \{\
> .ft CW
> \}
> .TS
> l l.
> offset	function
> _
> 00b0	lws_entry
> 00e0	set_thread_pointer
> 0100	linux_gateway_entry (syscall)
> 0268	syscall_nosys
> 0274	tracesys
> 0324	tracesys_next
> 0368	tracesys_exit
> 03a0	tracesys_sigexit
> 03b8	lws_start
> 03dc	lws_exit_nosys
> 03e0	lws_exit
> 03e4	lws_compare_and_swap64
> 03e8	lws_compare_and_swap
> 0404	cas_wouldblock
> 0410	cas_action
> .TE
> .if t \{\
> .in
> .ft P
> \}
>
> > There is support in glibc and libgcc for these calls. The libgcc
> > implementation in linux-atomic.c is very similar to that on arm.
>
> interesting. another arch to add :).
> -mike
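To make the "call by known offset, no symbol lookup" point concrete, here is a minimal C sketch of what a caller has to carry around itself. It assumes the offsets in the draft table quoted above, which describe one kernel build and are not presented in the thread as a stable ABI. The helper names gateway_offset() and gateway_branch_insn() are hypothetical and only format the branch instruction as text; nothing here executes on the gateway page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* One row of the draft offset table quoted in the message above. */
struct gateway_entry {
    unsigned int offset;
    const char *name;
};

/* Offsets copied from the draft table; treat them as illustrative only. */
static const struct gateway_entry gateway_table[] = {
    { 0x00b0, "lws_entry" },
    { 0x00e0, "set_thread_pointer" },
    { 0x0100, "linux_gateway_entry" },
    { 0x0268, "syscall_nosys" },
    { 0x0274, "tracesys" },
    { 0x0324, "tracesys_next" },
    { 0x0368, "tracesys_exit" },
    { 0x03a0, "tracesys_sigexit" },
    { 0x03b8, "lws_start" },
    { 0x03dc, "lws_exit_nosys" },
    { 0x03e0, "lws_exit" },
    { 0x03e4, "lws_compare_and_swap64" },
    { 0x03e8, "lws_compare_and_swap" },
    { 0x0404, "cas_wouldblock" },
    { 0x0410, "cas_action" },
};

/* Hypothetical helper: return the offset for a name, or -1 if unknown. */
long gateway_offset(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof(gateway_table) / sizeof(gateway_table[0]); i++) {
        if (strcmp(gateway_table[i].name, name) == 0)
            return (long)gateway_table[i].offset;
    }
    return -1;
}

/* Hypothetical helper: write the branch instruction text for an entry
 * into buf. Returns the snprintf() result, or -1 for an unknown name. */
int gateway_branch_insn(const char *name, char *buf, size_t len)
{
    long off = gateway_offset(name);

    if (off < 0)
        return -1;
    return snprintf(buf, len, "ble 0x%lx(%%sr2, %%r0)", (unsigned long)off);
}
```

Calling gateway_branch_insn("linux_gateway_entry", buf, sizeof buf) fills buf with the syscall entry instruction from the thread. The table lives in the caller, which is the contrast with a vDSO that James draws: a vDSO would describe its own entry points through ELF symbols.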
* Re: [PATCH] man2 : syscall.2 : document syscall calling conventions
From: John David Anglin @ 2013-04-12 12:17 UTC
To: James Bottomley
Cc: Mike Frysinger, Kyle McMartin, Michael Kerrisk (man-pages),
    linux-man, Kyle McMartin, Helge Deller, James E.J. Bottomley,
    linux-parisc

On 12-Apr-13, at 12:45 AM, James Bottomley wrote:
> On Thu, 2013-04-11 at 23:38 -0400, Mike Frysinger wrote:
>> On Thursday 11 April 2013 22:34:43 John David Anglin wrote:
>>> On 11-Apr-13, at 9:55 PM, Mike Frysinger wrote:
>>>> On Sunday 07 April 2013 14:48:42 John David Anglin wrote:
>>>>> On 7-Apr-13, at 2:39 PM, Mike Frysinger wrote:
>>>>>> just to be clear, the only insn you need is:
>>>>>> 	ble 0x100(%sr2, %r0);
>>>>>>
>>>>>> the kernel docs say sr2 holds the kernel gateway page (so i
>>>>>> guess 0x100 is a known offset into that). the docs don't
>>>>>> mention r0 that i can see, so i'm guessing it's one of those
>>>>>> "always 0" registers ?
>>>>>
>>>>> Yes. There is also an entry at offset 0xb0 for
>>>>> light-weight-syscalls. Currently, this implements an atomic CAS
>>>>> operation used for pthread support.
>>>>
>>>> interesting. sounds like a poor man's vDSO. i'll document this
>>>> in the new vdso(7) man page.
>>>
>>> Not exactly, the code runs on the gateway page which is in kernel
>>> space. The main reason for doing the operation in kernel space is
>>> to prevent processes from being preempted while executing in the
>>> lock region. In general, parisc processes are not preempted on the
>>> gateway page. There are some subtleties regarding fault handling.
>>
>> sure ... the Blackfin arch does a similar thing for providing fast
>> atomic primitives to userspace since the ISA can't.
>>
>> what do you think of this section for vdso(7) ? i might have to
>> split the "real" vdso arches from these others since there's a
>> couple now (arm, bfin, parisc), and i think there might be more
>> down the line (microblaze).
>
> I've got to say, I really don't think this can be classified as a
> vdso. For a vdso, the kernel exports an ELF object that can be linked
> dynamically into any elf binary requiring it. The ELF section
> information provides full details and so vdso entries can be called by
> symbol.
>
> In the parisc gateway page implementation, we have a set of "hidden"
> primitives which the executable must know how to call (no self
> description like a vdso). This mechanism is identical to the original
> intent of the x86 int <n> instruction (an instruction that traps into
> the kernel and performs some primitive action but to use it, you
> have to know which function corresponds to which value of <n>).

I agree with James. There is no ELF object exported to userspace. The
content of the gateway page is hidden. The data structures used for the
locks are in the kernel itself. Access is via a special branch
instruction rather than a break/trap instruction.

>
> James
>
>
>> .SS parisc (hppa) functions
>> .\" See linux/arch/parisc/kernel/syscall.S
>> .\" See linux/Documentation/parisc/registers
>> The parisc port has a code page full of utility functions.
>> Rather than use the normal ELF aux vector approach, it passes the address of
>> the page to the process via the SR2 register.
>> This is done to match the way HP-UX works.
>>
>> Since it's just a raw page of code, there is no ELF information for doing
>> symbol lookups or versioning.
>> Simply call into the appropriate offset via the branch instruction, e.g.:
>> .br
>> ble <offset>(%sr2, %r0)
>> .if t \{\
>> .ft CW
>> \}
>> .TS
>> l l.
>> offset	function
>> _
>> 00b0	lws_entry
>> 00e0	set_thread_pointer
>> 0100	linux_gateway_entry (syscall)
>> 0268	syscall_nosys
>> 0274	tracesys
>> 0324	tracesys_next
>> 0368	tracesys_exit
>> 03a0	tracesys_sigexit
>> 03b8	lws_start
>> 03dc	lws_exit_nosys
>> 03e0	lws_exit
>> 03e4	lws_compare_and_swap64
>> 03e8	lws_compare_and_swap
>> 0404	cas_wouldblock
>> 0410	cas_action
>> .TE
>> .if t \{\
>> .in
>> .ft P
>> \}
>>
>>> There is support in glibc and libgcc for these calls. The libgcc
>>> implementation in linux-atomic.c is very similar to that on arm.
>>
>> interesting. another arch to add :).
>> -mike
>
>
>

--
John David Anglin	dave.anglin@bell.net
* Re: [PATCH] man2 : syscall.2 : document syscall calling conventions
From: Mike Frysinger @ 2013-04-12 18:45 UTC
To: James Bottomley
Cc: John David Anglin, Kyle McMartin, Michael Kerrisk (man-pages),
    linux-man, Kyle McMartin, Helge Deller, James E.J. Bottomley,
    linux-parisc

On Friday 12 April 2013 00:45:12 James Bottomley wrote:
> On Thu, 2013-04-11 at 23:38 -0400, Mike Frysinger wrote:
> > On Thursday 11 April 2013 22:34:43 John David Anglin wrote:
> > > On 11-Apr-13, at 9:55 PM, Mike Frysinger wrote:
> > > > On Sunday 07 April 2013 14:48:42 John David Anglin wrote:
> > > >> On 7-Apr-13, at 2:39 PM, Mike Frysinger wrote:
> > > >>> just to be clear, the only insn you need is:
> > > >>> 	ble 0x100(%sr2, %r0);
> > > >>>
> > > >>> the kernel docs say sr2 holds the kernel gateway page (so i
> > > >>> guess 0x100 is a known offset into that). the docs don't
> > > >>> mention r0 that i can see, so i'm guessing it's one of those
> > > >>> "always 0" registers ?
> > > >>
> > > >> Yes. There is also an entry at offset 0xb0 for
> > > >> light-weight-syscalls. Currently, this implements an atomic
> > > >> CAS operation used for pthread support.
> > > >
> > > > interesting. sounds like a poor man's vDSO. i'll document this
> > > > in the new vdso(7) man page.
> > >
> > > Not exactly, the code runs on the gateway page which is in kernel
> > > space. The main reason for doing the operation in kernel space is
> > > to prevent processes from being preempted while executing in the
> > > lock region. In general, parisc processes are not preempted on
> > > the gateway page. There are some subtleties regarding fault
> > > handling.
> >
> > sure ... the Blackfin arch does a similar thing for providing fast
> > atomic primitives to userspace since the ISA can't.
> >
> > what do you think of this section for vdso(7) ? i might have to split
> > the "real" vdso arches from these others since there's a couple now
> > (arm, bfin, parisc), and i think there might be more down the line
> > (microblaze).
>
> I've got to say, I really don't think this can be classified as a vdso.
> For a vdso, the kernel exports an ELF object that can be linked
> dynamically into any elf binary requiring it. The ELF section
> information provides full details and so vdso entries can be called by
> symbol.

strictly speaking, sure, a vDSO is only a vDSO if it's an ELF (since the
acronym is literally "virtual dynamic shared object"). however, i see the
vdso as being a bit more of a flexible concept -- it's a place of shared
code that the kernel manages and exports for all userspace processes.
fundamentally, the point of the vDSO is to provide services to greatly
speed up userspace. in that regard, these mapped pages are exactly like
vDSOs.

thus i think it's appropriate to document these "fixed code" regions that
many arches export (ARM, Blackfin, Itanium, Microblaze, PA-RISC) in the
same man page as the vdso. especially since (currently) arches do one or
the other, but not both.

> In the parisc gateway page implementation, we have a set of "hidden"
> primitives which the executable must know how to call (no self
> description like a vdso). This mechanism is identical to the original
> intent of the x86 int <n> instruction (an instruction that traps into
> the kernel and performs some primitive action but to use it, you have to
> know which function corresponds to which value of <n>).

would it be useful to document all of them ? or just the ones that
userspace actively uses (like syscall/cas) ? or should all of this be
recorded in the kernel's Documentation/parisc/ subdir and just have the
man page refer people there (like it does for ARM & Blackfin currently) ?
-mike
* Re: [PATCH] man2 : syscall.2 : document syscall calling conventions
From: James Bottomley @ 2013-04-12 19:14 UTC
To: Mike Frysinger
Cc: John David Anglin, Kyle McMartin, Michael Kerrisk (man-pages),
    linux-man, Kyle McMartin, Helge Deller, James E.J. Bottomley,
    linux-parisc

On Fri, 2013-04-12 at 14:45 -0400, Mike Frysinger wrote:
> On Friday 12 April 2013 00:45:12 James Bottomley wrote:
> > On Thu, 2013-04-11 at 23:38 -0400, Mike Frysinger wrote:
> > > what do you think of this section for vdso(7) ? i might have to split
> > > the "real" vdso arches from these others since there's a couple now
> > > (arm, bfin, parisc), and i think there might be more down the line
> > > (microblaze).
> >
> > I've got to say, I really don't think this can be classified as a vdso.
> > For a vdso, the kernel exports an ELF object that can be linked
> > dynamically into any elf binary requiring it. The ELF section
> > information provides full details and so vdso entries can be called by
> > symbol.
>
> strictly speaking, sure, a vDSO is only a vDSO if it's an ELF (since the
> acronym is literally "virtual dynamic shared object"). however, i see the
> vdso as being a bit more of a flexible concept -- it's a place of shared
> code that the kernel manages and exports for all userspace processes.
> fundamentally, the point of the vDSO is to provide services to greatly
> speed up userspace. in that regard, these mapped pages are exactly like
> vDSOs.

I don't entirely understand this classification. If the kernel<->user
gateway becomes classified as a vdso, that covers our syscall interface
on every architecture. There's now no distinction between a vdso (which
may not even move to kernel mode) and a syscall.

I think the difference is that a syscall is a specific call to a known
kernel routine by number and it involves a transition to kernel mode. A
vdso is an exported link object containing certain functions which may
or may not cause a trap to kernel mode when executed. The distinction
is how you do the call. For syscalls, you have to know the number and
the arguments. For vdso you just have to know the symbol (and
obviously, the prototype for C code) and the kernel supplies the
implementation direct to the userspace binary.

> thus i think it's appropriate to document these "fixed code" regions that
> many arches export (ARM, Blackfin, Itanium, Microblaze, PA-RISC) in the
> same man page as the vdso. especially since (currently) arches do one or
> the other, but not both.

I really see these as a type of lightweight syscall. You use the
syscall prototype (call by number with known arguments) but the call may
not necessarily transition to kernel mode proper to handle the function.

> > In the parisc gateway page implementation, we have a set of "hidden"
> > primitives which the executable must know how to call (no self
> > description like a vdso). This mechanism is identical to the original
> > intent of the x86 int <n> instruction (an instruction that traps into
> > the kernel and performs some primitive action but to use it, you have to
> > know which function corresponds to which value of <n>).
>
> would it be useful to document all of them ? or just the ones that
> userspace actively uses (like syscall/cas) ? or should all of this be
> recorded in the kernel's Documentation/parisc/ subdir and just have the
> man page refer people there (like it does for ARM & Blackfin currently) ?

I'm not sure. For x86 they're in include/asm/traps.h. I think the only
ones we really use are int3 for breakpoint, int4 for overflow and int80
for legacy syscall.

James
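The distinction drawn above (a syscall is reached by number, a vDSO is an ELF object handed to the process) can be sketched in C on a Linux/glibc system. This is only an illustration under stated assumptions: it uses the real syscall(2) wrapper and getauxval(3) with AT_SYSINFO_EHDR, and the function names getpid_by_number() and vdso_has_elf_header() are hypothetical. It checks only that the image the kernel reports begins with the ELF magic and does not resolve any vDSO symbol.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <elf.h>
#include <string.h>
#include <sys/auxv.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Syscall style: the caller has to know the number (SYS_getpid) and the
 * argument list; the kernel does not describe the call in any way. */
long getpid_by_number(void)
{
    long pid = syscall(SYS_getpid);

    assert(pid > 0);    /* getpid cannot fail */
    return pid;
}

/* vDSO style: the kernel passes the address of an ELF image through the
 * aux vector. Returns 1 if that image starts with the ELF magic, 0 if it
 * does not, and -1 if the kernel reported no vDSO for this process. */
int vdso_has_elf_header(void)
{
    unsigned long base = getauxval(AT_SYSINFO_EHDR);

    if (base == 0)
        return -1;
    return memcmp((const void *)base, ELFMAG, SELFMAG) == 0 ? 1 : 0;
}
```

On a port that exports a vDSO, vdso_has_elf_header() is expected to return 1, and a dynamic linker can then look entries up by symbol. For a raw code page such as the parisc gateway page, the thread indicates there is no such ELF header to find.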
* Re: [PATCH] man2 : syscall.2 : document syscall calling conventions
From: Mike Frysinger @ 2013-04-12 19:46 UTC
To: James Bottomley
Cc: John David Anglin, Kyle McMartin, Michael Kerrisk (man-pages),
    linux-man, Kyle McMartin, Helge Deller, James E.J. Bottomley,
    linux-parisc

On Friday 12 April 2013 15:14:47 James Bottomley wrote:
> On Fri, 2013-04-12 at 14:45 -0400, Mike Frysinger wrote:
> > On Friday 12 April 2013 00:45:12 James Bottomley wrote:
> > > On Thu, 2013-04-11 at 23:38 -0400, Mike Frysinger wrote:
> > > > what do you think of this section for vdso(7) ? i might have to
> > > > split the "real" vdso arches from these others since there's a
> > > > couple now (arm, bfin, parisc), and i think there might be more down
> > > > the line (microblaze).
> > >
> > > I've got to say, I really don't think this can be classified as a vdso.
> > > For a vdso, the kernel exports an ELF object that can be linked
> > > dynamically into any elf binary requiring it. The ELF section
> > > information provides full details and so vdso entries can be called by
> > > symbol.
> >
> > strictly speaking, sure, a vDSO is only a vDSO if it's an ELF (since the
> > acronym is literally "virtual dynamic shared object"). however, i see
> > the vdso as being a bit more of a flexible concept -- it's a place of
> > shared code that the kernel manages and exports for all userspace
> > processes. fundamentally, the point of the vDSO is to provide services
> > to greatly speed up userspace. in that regard, these mapped pages are
> > exactly like vDSOs.
>
> I don't entirely understand this classification. If the kernel<->user
> gateway becomes classified as a vdso, that covers our syscall interface
> on every architecture. There's now no distinction between a vdso (which
> may not even move to kernel mode) and a syscall.
>
> I think the difference is that a syscall is a specific call to a known
> kernel routine by number and it involves a transition to kernel mode. A
> vdso is an exported link object containing certain functions which may
> or may not cause a trap to kernel mode when executed. The distinction
> is how you do the call. For syscalls, you have to know the number and
> the arguments. For vdso you just have to know the symbol (and
> obviously, the prototype for C code) and the kernel supplies the
> implementation direct to the userspace binary.

i'm not fully versed in the parisc linux gateway page or how the
architecture is handling things, so i could be completely off here. from
reading the source code, it *looked* like it was just a page of utility
funcs that userspace branches to without changing privilege modes or going
through the full syscall routines.

so i'm saying the gateway page itself can be thought of in the same vein
as a vDSO. it's a black box with entry points that provide light weight
services to userspace. sometimes it ends up triggering a full syscall,
sometimes it doesn't (just like a vDSO).

> > thus i think it's appropriate to document these "fixed code" regions that
> > many arches export (ARM, Blackfin, Itanium, Microblaze, PA-RISC) in the
> > same man page as the vdso. especially since (currently) arches do one
> > or the other, but not both.
>
> I really see these as a type of lightweight syscall. You use the
> syscall prototype (call by number with known arguments) but the call may
> not necessarily transition to kernel mode proper to handle the function.

if you think of the vdso in a very strict light (it's exactly an ELF that
the kernel automatically maps into every process's address space), then i
guess you can only classify these as lightweight syscalls (where the
address/offset is the "syscall #").

i see vdso as being a more flexible concept than that -- if it's code
mapped into a process's address space and provides useful lightweight
services that are meant to be used specifically in lieu of syscall(), then
it's vdso-like and should be in the vdso(7) man page. it has a lot more in
common imo with a vdso than it does with an actual syscall. i certainly
think vdso(7) is more appropriate for these regions than syscall(2) or
syscalls(2).

> > > In the parisc gateway page implementation, we have a set of "hidden"
> > > primitives which the executable must know how to call (no self
> > > description like a vdso). This mechanism is identical to the original
> > > intent of the x86 int <n> instruction (an instruction that traps into
> > > the kernel and performs some primitive action but to use it, you have
> > > to know which function corresponds to which value of <n>).
> >
> > would it be useful to document all of them ? or just the ones that
> > userspace actively uses (like syscall/cas) ? or should all of this be
> > recorded in the kernel's Documentation/parisc/ subdir and just have the
> > man page refer people there (like it does for ARM & Blackfin currently)
> > ?
>
> I'm not sure. For x86 they're in include/asm/traps.h. I think the only
> ones we really use are int3 for breakpoint, int4 for overflow and int80
> for legacy syscall.

hmm, i wasn't even considering the other arch-specific services offered by
e.g. software interrupts. i don't think those belong in vdso(7) as they
don't confer any of the lightweight advantages the vdso is designed to
bring, but it might be useful to document these somewhere. they're also
not as common for people to encounter as a vdso ...
-mike
* Re: [PATCH] man2 : syscall.2 : document syscall calling conventions
From: James Bottomley @ 2013-04-12 20:25 UTC
To: Mike Frysinger
Cc: John David Anglin, Kyle McMartin, Michael Kerrisk (man-pages),
    linux-man, Kyle McMartin, Helge Deller, James E.J. Bottomley,
    linux-parisc

On Fri, 2013-04-12 at 15:46 -0400, Mike Frysinger wrote:
> On Friday 12 April 2013 15:14:47 James Bottomley wrote:
> > On Fri, 2013-04-12 at 14:45 -0400, Mike Frysinger wrote:
> > > On Friday 12 April 2013 00:45:12 James Bottomley wrote:
> > > > On Thu, 2013-04-11 at 23:38 -0400, Mike Frysinger wrote:
> > > > > what do you think of this section for vdso(7) ? i might have to
> > > > > split the "real" vdso arches from these others since there's a
> > > > > couple now (arm, bfin, parisc), and i think there might be more down
> > > > > the line (microblaze).
> > > >
> > > > I've got to say, I really don't think this can be classified as a vdso.
> > > > For a vdso, the kernel exports an ELF object that can be linked
> > > > dynamically into any elf binary requiring it. The ELF section
> > > > information provides full details and so vdso entries can be called by
> > > > symbol.
> > >
> > > strictly speaking, sure, a vDSO is only a vDSO if it's an ELF (since the
> > > acronym is literally "virtual dynamic shared object"). however, i see
> > > the vdso as being a bit more of a flexible concept -- it's a place of
> > > shared code that the kernel manages and exports for all userspace
> > > processes. fundamentally, the point of the vDSO is to provide services
> > > to greatly speed up userspace. in that regard, these mapped pages are
> > > exactly like vDSOs.
> >
> > I don't entirely understand this classification. If the kernel<->user
> > gateway becomes classified as a vdso, that covers our syscall interface
> > on every architecture. There's now no distinction between a vdso (which
> > may not even move to kernel mode) and a syscall.
> >
> > I think the difference is that a syscall is a specific call to a known
> > kernel routine by number and it involves a transition to kernel mode. A
> > vdso is an exported link object containing certain functions which may
> > or may not cause a trap to kernel mode when executed. The distinction
> > is how you do the call. For syscalls, you have to know the number and
> > the arguments. For vdso you just have to know the symbol (and
> > obviously, the prototype for C code) and the kernel supplies the
> > implementation direct to the userspace binary.
>
> i'm not fully versed in the parisc linux gateway page or how the
> architecture is handling things, so i could be completely off here. from
> reading the source code, it *looked* like it was just a page of utility
> funcs that userspace branches to without changing privilege modes or going
> through the full syscall routines.

Oh, if that's the misunderstanding, then the gateway page is "special".
It actually has PAGE_GATEWAY bits set (this is linux terminology; in
parisc terminology it's Execute, promote to PL0) in the page map. So
anything executing on this page executes with kernel level privilege
(there's more to it than that: to have this happen, you also have to use
a branch with a ,gate completer to activate the privilege promotion).
The upshot is that everything that runs on the gateway page runs at
kernel privilege but with the current user process address space
(although you have access to kernel space via %sr2).

For the 0x100 syscall entry, we redo the space registers to point to the
kernel address space (preserving the user address space in %sr3), move
to wide mode if required, save the user registers and branch into the
kernel syscall entry point.

For all the other functions, we execute at kernel privilege but don't
flip address spaces. The basic upshot of this is that these code
snippets are executed atomically (because the kernel can't be
pre-empted) and they may perform architecturally forbidden (to PL3)
operations (like setting control registers).

James
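Earlier in the thread the light-weight syscall entry is described as providing an atomic CAS for pthread support. As a rough illustration of the semantics such a primitive offers to userspace, here is a portable C sketch. It is not the parisc gateway code and not the libgcc implementation: it assumes a compiler with the GCC __atomic builtins as a stand-in for the kernel-provided operation, and the names cas_int() and atomic_add_fetch_via_cas() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a compare-and-swap primitive: if *addr equals expected,
 * store desired and return 1; otherwise leave *addr alone and return 0. */
int cas_int(int *addr, int expected, int desired)
{
    assert(addr != NULL);
    return __atomic_compare_exchange_n(addr, &expected, desired,
                                       0 /* strong variant */,
                                       __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}

/* Usual retry loop: build an atomic add on top of CAS. The loop repeats
 * whenever another thread changed *addr between the load and the CAS. */
int atomic_add_fetch_via_cas(int *addr, int delta)
{
    int old;

    do {
        old = __atomic_load_n(addr, __ATOMIC_SEQ_CST);
    } while (!cas_int(addr, old, old + delta));
    return old + delta;
}
```

On hardware with a native CAS instruction the builtin compiles to that instruction. The thread's point is that where the ISA lacks one, running a short snippet that cannot be preempted is how the same guarantee is obtained.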
* Re: [PATCH] man2 : syscall.2 : document syscall calling conventions
From: Kyle McMartin @ 2013-04-12 14:01 UTC
To: Mike Frysinger
Cc: John David Anglin, Michael Kerrisk (man-pages), linux-man,
    Kyle McMartin, Helge Deller, James E.J. Bottomley, linux-parisc

On Thu, Apr 11, 2013 at 09:55:43PM -0400, Mike Frysinger wrote:
> interesting. sounds like a poor man's vDSO. i'll document this in the
> new vdso(7) man page.
> -mike

fwiw ia64 does basically the same thing for a subset of syscalls
(fsys.c)

--Kyle