linux-api.vger.kernel.org archive mirror
* Re: [RFC][v7][PATCH 8/9]: Define clone2() syscall
       [not found]         ` <20090925105632.GG12824-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
@ 2009-09-29 18:05           ` Sukadev Bhattiprolu
       [not found]             ` <20090929180537.GD4625-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Sukadev Bhattiprolu @ 2009-09-29 18:05 UTC (permalink / raw)
  To: Arnd Bergmann, Containers, Nathan Lynch,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	"Eric W. Biederman" <ebiederm@
  Cc: linux-api-u79uwXL29TY76Z2rM5mHXA,
	kosaki.motohiro-+CUm20s59erQFUHtdCDX3A

Ccing kosaki.motohiro-+CUm20s59erQFUHtdCDX3A@public.gmane.org and linux-api on this thread.

Louis Rilling [Louis.Rilling-aw0BnHfMbSpBDgjK7y7TUQ@public.gmane.org] wrote:
| > It will very likely break ia64, which defines CONFIG_HAVE_ARCH_TRACEHOOK and
| > already has sys_clone2().
| 
| -> sys_clone_ext() ?
| 
| Louis

How about spelling out extended and calling it clone_extended() ?

The other options I can think of are clone_with_pids() and clone3().

Thanks for your feedback.

Sukadev
--
To unsubscribe from this list: send the line "unsubscribe linux-api" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [RFC][v7][PATCH 8/9]: Define clone2() syscall
       [not found]             ` <20090929180537.GD4625-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
@ 2009-09-29 18:40               ` Roland McGrath
       [not found]                 ` <20090929184023.532DF34-nL1rrgvulkc2UH6IwYuUx0EOCMrvLtNR@public.gmane.org>
  2009-09-29 21:58               ` Oren Laadan
  1 sibling, 1 reply; 15+ messages in thread
From: Roland McGrath @ 2009-09-29 18:40 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: Arnd Bergmann, Containers, Nathan Lynch,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Eric W. Biederman,
	hpa-YMNOUZJC4hwAvxtiuMwx3w, mingo-X9Un+BFzKDI,
	torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, Alexey Dobriyan,
	Pavel Emelyanov, linux-api-u79uwXL29TY76Z2rM5mHXA,
	kosaki.motohiro-+CUm20s59erQFUHtdCDX3A

Why add a new syscall at all instead of just using a new CLONE_* flag to
indicate that the argument layout is different?

* Re: [RFC][v7][PATCH 8/9]: Define clone2() syscall
       [not found]                 ` <20090929184023.532DF34-nL1rrgvulkc2UH6IwYuUx0EOCMrvLtNR@public.gmane.org>
@ 2009-09-29 18:44                   ` H. Peter Anvin
       [not found]                     ` <4AC255A4.4030002-YMNOUZJC4hwAvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: H. Peter Anvin @ 2009-09-29 18:44 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Sukadev Bhattiprolu, Arnd Bergmann, Containers, Nathan Lynch,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Eric W. Biederman,
	mingo-X9Un+BFzKDI, torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	Alexey Dobriyan, Pavel Emelyanov,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	kosaki.motohiro-+CUm20s59erQFUHtdCDX3A

On 09/29/2009 11:40 AM, Roland McGrath wrote:
> Why add a new syscall at all instead of just using a new CLONE_* flag to
> indicate that the argument layout is different?

What an absolutely atrociously bad idea.

We already have a syscall layer which is painful to thunk in places, and
this would make it much worse.

	-hpa


* Re: [RFC][v7][PATCH 8/9]: Define clone2() syscall
       [not found]                     ` <4AC255A4.4030002-YMNOUZJC4hwAvxtiuMwx3w@public.gmane.org>
@ 2009-09-29 19:02                       ` Arjan van de Ven
       [not found]                         ` <20090929210207.247b94df-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
  2009-09-29 20:00                         ` H. Peter Anvin
  0 siblings, 2 replies; 15+ messages in thread
From: Arjan van de Ven @ 2009-09-29 19:02 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Arnd Bergmann, linux-api-u79uwXL29TY76Z2rM5mHXA, Containers,
	Nathan Lynch, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	Eric W. Biederman, kosaki.motohiro-+CUm20s59erQFUHtdCDX3A,
	mingo-X9Un+BFzKDI, Sukadev Bhattiprolu,
	torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, Alexey Dobriyan,
	Roland McGrath, Pavel Emelyanov

On Tue, 29 Sep 2009 11:44:52 -0700
"H. Peter Anvin" <hpa-YMNOUZJC4hwAvxtiuMwx3w@public.gmane.org> wrote:

> On 09/29/2009 11:40 AM, Roland McGrath wrote:
> > Why add a new syscall at all instead of just using a new CLONE_*
> > flag to indicate that the argument layout is different?
> 
> What an absolutely atrociously bad idea.
> 
> We already have a syscall layer which is painful to thunk in places,
> and this would make it much worse.
> 

syscalls are cheap as well.
cheaper than decades of dealing with such multiplexer mess ;/


-- 
Arjan van de Ven 	Intel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org


* Re: [RFC][v7][PATCH 8/9]: Define clone2() syscall
       [not found]                         ` <20090929210207.247b94df-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
@ 2009-09-29 19:10                           ` Linus Torvalds
       [not found]                             ` <alpine.LFD.2.01.0909291207410.6996-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Linus Torvalds @ 2009-09-29 19:10 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: H. Peter Anvin, Roland McGrath, Sukadev Bhattiprolu,
	Arnd Bergmann, Containers, Nathan Lynch,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Eric W. Biederman,
	mingo-X9Un+BFzKDI, Alexey Dobriyan, Pavel Emelyanov,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	kosaki.motohiro-+CUm20s59erQFUHtdCDX3A



On Tue, 29 Sep 2009, Arjan van de Ven wrote:
> > 
> > We already have a syscall layer which is painful to thunk in places,
> > and this would make it much worse.
> 
> syscalls are cheap as well.
> cheaper than decades of dealing with such multiplexer mess ;/

Well, I'd agree, except the clone flags really _are_ about multiplexer 
issues, and the new flag wouldn't really change anything.

If the new system call actually had appreciably separate code-paths, I'd 
buy the "multiplexer" argument. But it doesn't really. It's going to call 
down to the same basic clone functionality, and the core clone code ends 
up de-multiplexing the cases anyway.

So this would not at all be like the socket calls (to pick the traditional 
Linux system call multiplexing example) in that sense.

			Linus

* Re: [RFC][v7][PATCH 8/9]: Define clone2() syscall
  2009-09-29 19:02                       ` Arjan van de Ven
       [not found]                         ` <20090929210207.247b94df-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
@ 2009-09-29 20:00                         ` H. Peter Anvin
  1 sibling, 0 replies; 15+ messages in thread
From: H. Peter Anvin @ 2009-09-29 20:00 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Roland McGrath, Sukadev Bhattiprolu, Arnd Bergmann, Containers,
	Nathan Lynch, linux-kernel, Eric W. Biederman, mingo, torvalds,
	Alexey Dobriyan, Pavel Emelyanov, linux-api, kosaki.motohiro

On 09/29/2009 12:02 PM, Arjan van de Ven wrote:
> On Tue, 29 Sep 2009 11:44:52 -0700
> "H. Peter Anvin" <hpa@zytor.com> wrote:
> 
>> On 09/29/2009 11:40 AM, Roland McGrath wrote:
>>> Why add a new syscall at all instead of just using a new CLONE_*
>>> flag to indicate that the argument layout is different?
>>
>> What an absolutely atrociously bad idea.
>>
>> We already have a syscall layer which is painful to thunk in places,
>> and this would make it much worse.
>>
> syscalls are cheap as well.
> cheaper than decades of dealing with such multiplexer mess ;/
> 

It really comes down to wanting all the dispatch to happen in one
central place.

	-hpa


* Re: [RFC][v7][PATCH 8/9]: Define clone2() syscall
       [not found]                             ` <alpine.LFD.2.01.0909291207410.6996-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
@ 2009-09-29 20:02                               ` H. Peter Anvin
       [not found]                                 ` <4AC267C7.4070300-YMNOUZJC4hwAvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: H. Peter Anvin @ 2009-09-29 20:02 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Arjan van de Ven, Roland McGrath, Sukadev Bhattiprolu,
	Arnd Bergmann, Containers, Nathan Lynch,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Eric W. Biederman,
	mingo-X9Un+BFzKDI, Alexey Dobriyan, Pavel Emelyanov,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	kosaki.motohiro-+CUm20s59erQFUHtdCDX3A

On 09/29/2009 12:10 PM, Linus Torvalds wrote:
> 
> On Tue, 29 Sep 2009, Arjan van de Ven wrote:
>>>
>>> We already have a syscall layer which is painful to thunk in places,
>>> and this would make it much worse.
>>
>> syscalls are cheap as well.
>> cheaper than decades of dealing with such multiplexer mess ;/
> 
> Well, I'd agree, except the clone flags really _are_ about multiplexer 
> issues, and the new flag wouldn't really change anything.
> 
> If the new system call actually had appreciably separate code-paths, I'd 
> buy the "multiplexer" argument. But it doesn't really. It's going to call 
> down to the same basic clone functionality, and the core clone code ends 
> up de-multiplexing the cases anyway.
> 
> So this would not at all be like the socket calls (to pick the traditional 
> Linux system call multiplexing example) in that sense.
> 

That's not the main issue here, though.  The main issue is that the
prototype of the function now depends on one of its arguments, which is
absolute hell for anything that needs to thunk arguments in a systematic
way, which we have to do on several architectures, and which would be
useful to be able to do for others, too.

	-hpa
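
[Editorial sketch, not part of the original message.] The thunking concern above can be illustrated in userspace C. Everything named here is hypothetical and invented for the illustration: the flag CLONE_EXTINFO_DEMO, the struct layouts, and the three thunk functions. With two separate entry points, each thunk converts one fixed layout. With a single multiplexed entry point, the thunk first has to inspect the flags to learn what its second argument even is.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag: when set, the second argument is a pointer to an
 * extended block instead of a plain stack address. Not a real CLONE_* flag. */
#define CLONE_EXTINFO_DEMO 0x1u

/* Hypothetical extended block as a compact (32-bit style) caller lays it out. */
struct ext_args_demo32 {
    uint32_t stack;
    uint32_t nr_pids;
};

/* What the native side ends up seeing after thunking. */
struct native_view {
    uint64_t flags;
    uint64_t stack;
    uint64_t nr_pids;
};

/* Separate entry point 1: fixed layout, arguments are simply widened. */
static struct native_view thunk_clone_demo(uint32_t flags, uint32_t stack)
{
    struct native_view v = { flags, stack, 0 };
    return v;
}

/* Separate entry point 2: fixed layout, the block is converted field by field. */
static struct native_view thunk_clone_ext_demo(uint32_t flags,
                                               const struct ext_args_demo32 *ext)
{
    assert(ext != NULL);
    struct native_view v = { flags, ext->stack, ext->nr_pids };
    return v;
}

/* Multiplexed entry point: the meaning of 'arg' depends on 'flags', so the
 * thunk cannot be written from the prototype alone. */
static struct native_view thunk_clone_muxed_demo(uint32_t flags, uintptr_t arg)
{
    if (flags & CLONE_EXTINFO_DEMO)
        return thunk_clone_ext_demo(flags, (const struct ext_args_demo32 *)arg);
    return thunk_clone_demo(flags, (uint32_t)arg);
}
```

The first two thunks could in principle be generated mechanically from their prototypes; the third needs hand-written knowledge of the flag, which is the kind of special case the message objects to.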


* Re: [RFC][v7][PATCH 8/9]: Define clone2() syscall
       [not found]             ` <20090929180537.GD4625-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
  2009-09-29 18:40               ` Roland McGrath
@ 2009-09-29 21:58               ` Oren Laadan
  1 sibling, 0 replies; 15+ messages in thread
From: Oren Laadan @ 2009-09-29 21:58 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: Arnd Bergmann, Containers, Nathan Lynch,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Eric W. Biederman,
	hpa-YMNOUZJC4hwAvxtiuMwx3w, mingo-X9Un+BFzKDI,
	torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, Alexey Dobriyan,
	Pavel Emelyanov, linux-api-u79uwXL29TY76Z2rM5mHXA,
	kosaki.motohiro-+CUm20s59erQFUHtdCDX3A



Sukadev Bhattiprolu wrote:
> Ccing kosaki.motohiro-+CUm20s59erQFUHtdCDX3A@public.gmane.org and linux-api on this thread.
> 
> Louis Rilling [Louis.Rilling-aw0BnHfMbSpBDgjK7y7TUQ@public.gmane.org] wrote:
> | > It will very likely break ia64, which defines CONFIG_HAVE_ARCH_TRACEHOOK and
> | > already has sys_clone2().
> | 
> | -> sys_clone_ext() ?
> | 
> | Louis
> 
> How about spelling out extended and calling it clone_extended() ?
> 
> The other options I can think of are clone_with_pids() and clone3().

I like clone3(), or clone_new() ?

or even better -- how about xerox()  :p

Oren.


* Re: [RFC][v7][PATCH 8/9]: Define clone2() syscall
       [not found]                                 ` <4AC267C7.4070300-YMNOUZJC4hwAvxtiuMwx3w@public.gmane.org>
@ 2009-09-29 22:11                                   ` Linus Torvalds
       [not found]                                     ` <alpine.LFD.2.01.0909291501530.6996-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Linus Torvalds @ 2009-09-29 22:11 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Pavel Emelyanov, Arnd Bergmann, linux-api-u79uwXL29TY76Z2rM5mHXA,
	Containers, Nathan Lynch, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	Eric W. Biederman, kosaki.motohiro-+CUm20s59erQFUHtdCDX3A,
	mingo-X9Un+BFzKDI, Sukadev Bhattiprolu, Alexey Dobriyan,
	Roland McGrath, Arjan van de Ven



On Tue, 29 Sep 2009, H. Peter Anvin wrote:
> 
> That's not the main issue here, though.  The main issue is that the
> prototype of the function now depends on one of its arguments

Ok, I agree with that. The kernel side is easy (we have magic calling 
conventions there and need to turn registers into arguments anyway before 
you get to the shared code), but your point about the user side prototype 
is valid.

However, that could easily be handled by just having an extended_clone()
prototype that then sets the CLONE_EXTINFO (or whatever) bit in the flags. 
I think most of the time the clone() stuff needs special user-level 
wrappers anyway to handle the stack setup etc, no?

In other words, what I'd suggest we could do is

 - the kernel "do_fork()" interface would be made to have the "extended" 
   format by default - so the _kernel_ never has two formats in its 
   generic logic.

 - the "sys_clone()" system call, that already needs to munge the user 
   mode registers into the "do_fork()" format, would be the one that 
   recognizes the new flag and copies the extended data from user mode 
   memory to the extended info mode.

Then each architecture would need to update its "sys_clone()" function to
take advantage of the new extended format, but that's something that the 
new system call would have had to do anyway, so that's not an added burden 
in any way.

Hmm?

I don't feel horribly strongly about this, and as far as I'm concerned 
it's fine to also do it as a new system call too (we already have 'fork()' 
and 'vfork()' as special case interfaces to do_fork() - the new 'extended 
clone' would be no different).

I just think that Roland is correct that if the new extended fork handles 
the "no new info" case itself _anyway_, then there is no upside to making 
it a new system call, since the complexity is the same as just extending 
the old one.

			Linus
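
[Editorial sketch, not part of the original message.] The two-layer split proposed above can be modelled in plain userspace C. This is only an illustration under stated assumptions: the flag value, struct clone_ext_demo, do_fork_demo() and sys_clone_demo() are all hypothetical, the returned pid is a dummy, and memcpy() stands in for the copy from user memory that real kernel code would perform.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag with an arbitrary value; not a real CLONE_* flag. */
#define CLONE_EXTINFO_DEMO 0x80000000ul
#define DEMO_MAX_PIDS 4

/* Hypothetical "extended" format: the only one the generic layer sees. */
struct clone_ext_demo {
    unsigned long flags;
    unsigned long stack;
    int nr_pids;
    int pids[DEMO_MAX_PIDS];
};

/* Stand-in for the generic layer. It never sees the legacy format.
 * Returns a dummy pid: the first requested one, or 1000 if none was given. */
static int do_fork_demo(const struct clone_ext_demo *ext)
{
    assert(ext != NULL);
    return ext->nr_pids > 0 ? ext->pids[0] : 1000;
}

/* Stand-in for the entry point: recognizes the flag and normalizes both
 * calling styles into the extended format before calling the generic layer. */
static int sys_clone_demo(unsigned long flags, unsigned long stack,
                          const void *user_ext)
{
    struct clone_ext_demo ext;

    memset(&ext, 0, sizeof(ext));
    if (flags & CLONE_EXTINFO_DEMO) {
        if (user_ext == NULL)
            return -1;
        /* memcpy() stands in for copying the block from user memory. */
        memcpy(&ext, user_ext, sizeof(ext));
        if (ext.nr_pids < 0 || ext.nr_pids > DEMO_MAX_PIDS)
            return -1;
    } else {
        /* Legacy style: only the plain arguments are filled in. */
        ext.stack = stack;
    }
    ext.flags = flags & ~CLONE_EXTINFO_DEMO;
    return do_fork_demo(&ext);
}
```

In this model the legacy and extended callers differ only inside sys_clone_demo(); whether that entry point is the existing syscall with a new flag or a separate new syscall does not change do_fork_demo() at all.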


* Re: [RFC][v7][PATCH 8/9]: Define clone2() syscall
       [not found]                                     ` <alpine.LFD.2.01.0909291501530.6996-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
@ 2009-09-29 22:19                                       ` H. Peter Anvin
       [not found]                                         ` <4AC287F2.8060603-YMNOUZJC4hwAvxtiuMwx3w@public.gmane.org>
  2009-09-30  6:48                                       ` Roland McGrath
  1 sibling, 1 reply; 15+ messages in thread
From: H. Peter Anvin @ 2009-09-29 22:19 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Arjan van de Ven, Roland McGrath, Sukadev Bhattiprolu,
	Arnd Bergmann, Containers, Nathan Lynch,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Eric W. Biederman,
	mingo-X9Un+BFzKDI, Alexey Dobriyan, Pavel Emelyanov,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	kosaki.motohiro-+CUm20s59erQFUHtdCDX3A

On 09/29/2009 03:11 PM, Linus Torvalds wrote:
> 
> Ok, I agree with that. The kernel side is easy (we have magic calling 
> conventions there and need to turn registers into arguments anyway before 
> you get to the shared code), but your point about the user side prototype 
> is valid.
> 

I think it would also apply to kernel-side munging.  It's quite possible
you're right in that clone is such a special case anyway, but it seems
pointless to make it more special in the short bus sort of way even if
it is possible.

Let's just make it another system call.  It doesn't have any downside
that I can see, might prevent problems, and avoids setting a bad
precedent that someone can misinterpret.

	-hpa


* Re: [RFC][v7][PATCH 8/9]: Define clone2() syscall
       [not found]                                     ` <alpine.LFD.2.01.0909291501530.6996-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
  2009-09-29 22:19                                       ` H. Peter Anvin
@ 2009-09-30  6:48                                       ` Roland McGrath
  1 sibling, 0 replies; 15+ messages in thread
From: Roland McGrath @ 2009-09-30  6:48 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: H. Peter Anvin, Arjan van de Ven, Sukadev Bhattiprolu,
	Arnd Bergmann, Containers, Nathan Lynch,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Eric W. Biederman,
	mingo-X9Un+BFzKDI, Alexey Dobriyan, Pavel Emelyanov,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	kosaki.motohiro-+CUm20s59erQFUHtdCDX3A

The glibc prototype for clone uses ... and is not a direct map to the
syscall args anyway.  So that would not change for adding more optional
args enabled by certain flags, as it did not change to add the tid pointer
arguments before.  But indeed the library function would have to change to
pass on additional or different args to the existing syscall.


Thanks,
Roland
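
[Editorial sketch, not part of the original message.] The point about a variadic library prototype can be shown with a small C example. It does not reproduce glibc's actual clone() declaration or implementation; the flag names, struct demo_raw_args and clone_wrapper_demo() are hypothetical. The idea is only that a "..." prototype lets the wrapper pick up optional trailing arguments according to the flags, so the declaration stays the same when new flag-enabled arguments are added, while the wrapper body still has to change to forward them.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical flags: each one says "an extra int * argument follows". */
#define DEMO_PARENT_SETTID 0x1
#define DEMO_CHILD_SETTID  0x2

/* What the wrapper would hand to the raw system call in this model. */
struct demo_raw_args {
    int flags;
    int *parent_tid;
    int *child_tid;
};

/* Variadic wrapper: the prototype is fixed, the optional arguments are
 * read in a fixed order, and only when the matching flag is set. */
static struct demo_raw_args clone_wrapper_demo(int flags, ...)
{
    struct demo_raw_args raw = { flags, NULL, NULL };
    va_list ap;

    va_start(ap, flags);
    if (flags & DEMO_PARENT_SETTID)
        raw.parent_tid = va_arg(ap, int *);
    if (flags & DEMO_CHILD_SETTID)
        raw.child_tid = va_arg(ap, int *);
    va_end(ap);
    return raw;
}
```

Existing callers such as clone_wrapper_demo(0) keep compiling unchanged when a new optional argument is introduced; only the wrapper body grows another va_arg() branch.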

* Re: [RFC][v7][PATCH 8/9]: Define clone2() syscall
       [not found]                                         ` <4AC287F2.8060603-YMNOUZJC4hwAvxtiuMwx3w@public.gmane.org>
@ 2009-09-30 16:15                                           ` Arnd Bergmann
       [not found]                                             ` <200909301815.45211.arnd-r2nGTMty4D4@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Arnd Bergmann @ 2009-09-30 16:15 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Linus Torvalds, Arjan van de Ven, Roland McGrath,
	Sukadev Bhattiprolu, Containers, Nathan Lynch,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Eric W. Biederman,
	mingo-X9Un+BFzKDI, Alexey Dobriyan, Pavel Emelyanov,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	kosaki.motohiro-+CUm20s59erQFUHtdCDX3A

On Wednesday 30 September 2009, H. Peter Anvin wrote:
> Let's just make it another system call.  It doesn't have any downside
> that I can see, might prevent problems, and avoids setting a bad
> precedent that someone can misinterpret.

One more argument for this is that the new code is architecture independent
using user_stack_pointer(), while the original sys_clone is highly
architecture specific, which is a source for bugs when trying to
extend it.

	Arnd <><

* Re: [RFC][v7][PATCH 8/9]: Define clone2() syscall
       [not found]                                             ` <200909301815.45211.arnd-r2nGTMty4D4@public.gmane.org>
@ 2009-09-30 16:27                                               ` Linus Torvalds
       [not found]                                                 ` <alpine.LFD.2.01.0909300925170.6996-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Linus Torvalds @ 2009-09-30 16:27 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Pavel Emelyanov, linux-api-u79uwXL29TY76Z2rM5mHXA, Containers,
	Nathan Lynch, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	Eric W. Biederman, kosaki.motohiro-+CUm20s59erQFUHtdCDX3A,
	H. Peter Anvin, mingo-X9Un+BFzKDI, Sukadev Bhattiprolu,
	Alexey Dobriyan, Roland McGrath, Arjan van de Ven



On Wed, 30 Sep 2009, Arnd Bergmann wrote:
> 
> One more argument for this is that the new code is architecture independent
> using user_stack_pointer(), while the original sys_clone is highly
> architecture specific, which is a source for bugs when trying to
> extend it.

Umm. I don't think that is possible.

You need architecture-specific code to even get access to all registers to 
copy and get a signal-handler-compatible stack frame. See for example 
arch/alpha/kernel/entry.S with the switch-stack thing etc.  I don't think 
there is any way to make that even remotely architecture-neutral.

			Linus


* Re: [RFC][v7][PATCH 8/9]: Define clone2() syscall
       [not found]                                                 ` <alpine.LFD.2.01.0909300925170.6996-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
@ 2009-09-30 17:59                                                   ` Arnd Bergmann
       [not found]                                                     ` <200909301959.41706.arnd-r2nGTMty4D4@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Arnd Bergmann @ 2009-09-30 17:59 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pavel Emelyanov, linux-api-u79uwXL29TY76Z2rM5mHXA, Containers,
	Nathan Lynch, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	Eric W. Biederman, kosaki.motohiro-+CUm20s59erQFUHtdCDX3A,
	H. Peter Anvin, mingo-X9Un+BFzKDI, Sukadev Bhattiprolu,
	Alexey Dobriyan, Roland McGrath, Arjan van de Ven

On Wednesday 30 September 2009, Linus Torvalds wrote:
> Umm. I don't think that is possible.
> 
> You need architecture-specific code to even get access to all registers to 
> copy and get a signal-handler-compatible stack frame. See for example 
> arch/alpha/kernel/entry.S with the switch-stack thing etc.  I don't think 
> there is any way to make that even remotely architecture-neutral.

Right, you still need to save all the registers from the entry code.
I was under the wrong assumption that task_pt_regs(current)
would give the full register set on all architectures.

However, I'd still hope that a new system call can be defined in
a way that you only need to have an assembly wrapper to save
the full pt_regs, but no arch specific code to get the syscall arguments
out of that again. In do_clone(), you need a pointer to pt_regs and
the user stack pointer, but that can be generated from
user_stack_pointer(regs).

Does task_pt_regs(current) give the right pointer on all architectures
or do we also need to pass the regs into the syscall?

	Arnd <><


* Re: [RFC][v7][PATCH 8/9]: Define clone2() syscall
       [not found]                                                     ` <200909301959.41706.arnd-r2nGTMty4D4@public.gmane.org>
@ 2009-09-30 19:14                                                       ` Linus Torvalds
  0 siblings, 0 replies; 15+ messages in thread
From: Linus Torvalds @ 2009-09-30 19:14 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: H. Peter Anvin, Arjan van de Ven, Roland McGrath,
	Sukadev Bhattiprolu, Containers, Nathan Lynch,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Eric W. Biederman,
	mingo-X9Un+BFzKDI, Alexey Dobriyan, Pavel Emelyanov,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	kosaki.motohiro-+CUm20s59erQFUHtdCDX3A



On Wed, 30 Sep 2009, Arnd Bergmann wrote:
> 
> Right, you still need to save all the registers from the entry code.
> I was under the wrong assumption that task_pt_regs(current)
> would give the full register set on all architectures.
> 
> However, I'd still hope that a new system call can be defined in
> a way that you only need to have an assembly wrapper to save
> the full pt_regs, but no arch specific code to get the syscall arguments
> out of that again. In do_clone(), you need a pointer to pt_regs and
> the user stack pointer, but that can be generated from
> user_stack_pointer(regs).

I don't think it can. You don't know what the system call stack layout is. 

> Does task_pt_regs(current) give the right pointer on all architectures
> or do we also need to pass the regs into the syscall?

I do not believe that it gives the right pointer in general. In fact, I 
can guarantee it doesn't. Even on x86 it only works for certain contexts 
(non-vm86 mode at a minimum), and on architectures like alpha it's not at 
all sufficient, because even if you can locate the 'pt_regs' structure, 
you _also_ need the extra guarantees of the pt_regs being next to the 
extended signal state register structure - and that only happens for magic 
sequences like signal handling and explicit setups like fork/clone.

So I do repeat: if you think you can do all of this in generic code, then 
you're sadly and totally mistaken. Don't even try. It may work on some 
architectures, but it's simply fundamentally _wrong_.

		Linus
