From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mathieu Desnoyers
Subject: Re: [RFC PATCH v9 for 4.15 01/14] Restartable sequences system call
Date: Mon, 23 Oct 2017 20:44:01 +0000 (UTC)
Message-ID: <439398759.47028.1508791441765.JavaMail.zimbra@efficios.com>
References: <20171012230326.19984-1-mathieu.desnoyers@efficios.com>
 <20171012230326.19984-2-mathieu.desnoyers@efficios.com>
 <515879378.43966.1508350299712.JavaMail.zimbra@efficios.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8BIT
Sender: linux-api-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: Ben Maurer
Cc: "Paul E. McKenney", Boqun Feng, Peter Zijlstra, Paul Turner,
 Andrew Hunter, Andy Lutomirski, Dave Watson, Josh Triplett, Will Deacon,
 linux-kernel, Thomas Gleixner, Andi Kleen, Chris Lameter, Ingo Molnar,
 "H. Peter Anvin", rostedt, Linus Torvalds, Andrew Morton, Russell King,
 Catalin Marinas, Michael Kerrisk, Alexander Viro, linux-api
List-Id: linux-api@vger.kernel.org

----- On Oct 23, 2017, at 7:30 PM, Ben Maurer bmaurer-b10kYP2dOMg@public.gmane.org wrote:

>> if (!((long)ip - (long)start_ip <= (long)post_commit_offset))
>>   return 1;
>
>> This introduces an issue here: if "ip" is lower than "start_ip", we
>> can incorrectly think we are in a critical section, when we are in
>> fact not.
>
> This shouldn't be an issue if we used unsigned numbers. Eg if start_ip is X and
> post_commit_offset is L, then (ip - X <= L) means that if ip is less than X ip
> - X will be signed, which will become a large unsigned value.
>
>> or to the kernel to set it back to NULL if it finds out that it is
>> preempting/delivering a signal over an instruction pointer outside
>> of the current rseq_cs start_ip/post_commit_ip range (lazy clear).
>
> I see, lazy clear makes sense. Still, if during most execution periods the user
> code enters some rseq section (likely if rseq is used for something like
> malloc) on every context switch this code will have to be run.
>
>> Moreover, this modification would add a subtraction on the common case
>> (ip - start_ip), and makes the ABI slightly uglier.
>
> We could benchmark it but the subtraction should be similar in cost to the extra
> comparison but reducing the number of branches seems like it will help as well.
> FWIW GCC attempts to translate this kind of sequence to a subtract and compare:
> https://godbolt.org/g/5DGLvo.
>
> I agree the ABI is uglier, but since we're mucking with every context switch I
> thought I'd point it out.

Thanks for following up on this. I did not initially realize the importance
of doing the unsigned comparison.

I've pushed a commit in my private dev branch implementing your suggestion.

https://github.com/compudj/linux-percpu-dev/commit/4cf8e9104636b51741c0118f2c88519e3acab7aa

Thanks!

Mathieu

>
>> If I understand well, you are proposing to speed up .so load time by
>> means of removing relocations of pointers within rseq_cs, done by
>> making those relative to the rseq_cs address.
>
> Yeah, I think this may be overkill as optimization.

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
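
[Archive note] The difference between the signed check quoted in this message and the unsigned comparison Ben suggests can be sketched in plain C. This is only an illustration under stated assumptions, not the code from the commit linked above: the function names, the use of uintptr_t, and the inclusive "<=" bound (copied from the quoted snippet) are all hypothetical choices made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Reference form with two comparisons (two branches).
 * Assumes start_ip + post_commit_offset does not wrap around. */
static bool in_cs_two_compares(uintptr_t ip, uintptr_t start_ip,
                               uintptr_t post_commit_offset)
{
    return ip >= start_ip && ip <= start_ip + post_commit_offset;
}

/* Signed form, as in the quoted snippet: when ip < start_ip the
 * difference is negative, so it compares as "<= offset" and the
 * address is wrongly reported as inside the critical section. */
static bool in_cs_signed(uintptr_t ip, uintptr_t start_ip,
                         uintptr_t post_commit_offset)
{
    return (intptr_t)ip - (intptr_t)start_ip <= (intptr_t)post_commit_offset;
}

/* Unsigned form: when ip < start_ip the subtraction wraps to a very
 * large value, which fails the "<= offset" test. One subtraction and
 * one comparison cover both bounds. */
static bool in_cs_unsigned(uintptr_t ip, uintptr_t start_ip,
                           uintptr_t post_commit_offset)
{
    return ip - start_ip <= post_commit_offset;
}

/* Hypothetical self-check: the unsigned form agrees with the
 * two-comparison form on addresses below, inside, and above the range. */
static void check_equivalence(void)
{
    const uintptr_t start_ip = 0x1000, offset = 0x10;

    for (uintptr_t ip = 0x0ff0; ip <= 0x1020; ip++)
        assert(in_cs_unsigned(ip, start_ip, offset) ==
               in_cs_two_compares(ip, start_ip, offset));
}
```

With start_ip = 0x1000 and post_commit_offset = 0x10, an ip of 0x0ff8 lies below the section: in_cs_signed() accepts it, while in_cs_unsigned() rejects it, which is the case discussed in the thread.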