From: Florian Weimer
Subject: Re: For review: documentation of clone3() system call
Date: Mon, 11 Nov 2019 16:20:36 +0100
Message-ID: <875zjqwibf.fsf@oldenburg2.str.redhat.com>
References: <87tv7awj4g.fsf@oldenburg2.str.redhat.com>
In-Reply-To: (Jann Horn's message of "Mon, 11 Nov 2019 16:15:58 +0100")
To: Jann Horn
Cc: "Michael Kerrisk (man-pages)", Christian Brauner, lkml, linux-man,
    Kees Cook, Oleg Nesterov, Arnd Bergmann, David Howells,
    Pavel Emelyanov, Andrew Morton, Adrian Reber, Andrei Vagin, Linux API
List-Id: linux-api@vger.kernel.org

* Jann Horn:

> On Mon, Nov 11, 2019 at 4:03 PM Florian Weimer wrote:
>>
>> * Michael Kerrisk:
>>
>> >   Another difference for the raw clone() system call is that the
>> >   stack argument may be NULL, in which case the child uses a
>> >   duplicate of the parent's stack.  (Copy-on-write semantics ensure
>> >   that the child gets separate copies of stack pages when either
>> >   process modifies the stack.)  In this case, for correct operation,
>> >   the CLONE_VM option should not be specified.  (If the child shares
>> >   the parent's memory because of the use of the CLONE_VM flag, then
>> >   no copy-on-write duplication occurs and chaos is likely to result.)
>>
>> I think sharing the stack also works with CLONE_VFORK with CLONE_VM, as
>> long as measures are taken to preserve the return address in a register.
>
> That basically just requires that the userspace function declaration
> for clone3 includes __attribute__((returns_twice)), right?

The clone3 implementation itself would have to keep the return address
in a register.  By the time of the second return, a return address
stored on the stack may have been overwritten by the subprocess,
because what used to be the stack frame of the clone function has since
been reused for something else.

__attribute__ ((returns_twice)) is likely needed as well, but that
benefits the caller.  It's also not clear that it is sufficient for
this to work in all cases.  (But the mandatory-to-implement vfork
function faces the same problems.)

Thanks,
Florian
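
For illustration, a minimal C sketch of the two points above, assuming
GCC or Clang on Linux with glibc.  The name my_clone3 is hypothetical:
it is only declared, never defined or called, and merely shows where the
attribute would sit on a userspace prototype.  spawn_and_wait is also a
made-up helper.  It uses vfork(), where the child borrows the parent's
stack, and has the child do nothing but call _exit(), so the parent's
frame is left intact for the second return.

```c
#define _DEFAULT_SOURCE   /* exposes vfork() in <unistd.h> on glibc */
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical prototype, for illustration only.  No such wrapper is
   defined here.  The attribute tells the compiler that the call may
   return a second time, which constrains how the caller is optimized. */
struct clone_args;
long my_clone3(struct clone_args *args, size_t size)
    __attribute__((returns_twice));

/* Start a child with vfork() and return its exit status, or -1 on
   error.  The child shares the parent's memory and stack until it
   exits, so it must not return from this function or write to
   anything the parent still needs. */
int spawn_and_wait(int exit_code)
{
    pid_t pid = vfork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(exit_code);   /* child: leave immediately, no return */

    /* Parent: resumed only after the child has exited. */
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Note that vfork() cannot be wrapped in an ordinary C function that
returns in the child: the wrapper's frame would be reused by the child
before the parent resumes, which is the corruption described above.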