From mboxrd@z Thu Jan 1 00:00:00 1970
From: ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org (Eric W. Biederman)
Subject: Re: Simplfying copy_siginfo_to_user
Date: Mon, 24 Jul 2017 14:01:59 -0500
Message-ID: <87r2x54q1k.fsf@xmission.com>
References: <87o9shg7t7.fsf_-_@xmission.com>
 <20170718140651.15973-7-ebiederm@xmission.com>
 <878tjlbqpt.fsf@xmission.com>
 <8760ek5ics.fsf_-_@xmission.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: (Linus Torvalds's message of "Mon, 24 Jul 2017 10:43:34 -0700")
Sender: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Errors-To: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
To: Linus Torvalds
Cc: "linux-arch-u79uwXL29TY76Z2rM5mHXA@public.gmane.org", Andrei Vagin,
 Greg KH, Linux Containers, Pavel Emelyanov, Oleg Nesterov,
 Linux Kernel Mailing List, Al Viro, Andy Lutomirski, Linux API,
 Cyrill Gorcunov, Michael Kerrisk, Thomas Gleixner, Willy Tarreau,
 Andrey Vagin
List-Id: containers.vger.kernel.org

Linus Torvalds writes:

> On Sat, Jul 22, 2017 at 1:25 PM, Eric W. Biederman wrote:
>> I played with some clever changes such as limiting the copy to 48 bytes,
>> disabling the memset and the like but I could not get a strong enough
>> signal to say that any one change removed the extra or a clear part of
>> it 20ns.
>
> What CPU did you use? Because the SMAP bit in particular matters.
>
> The field-by-field copies are extremely slow on modern CPU's that
> implement SMAP, unless you also use the special "unsafe_put_user()"
> code (or the nasty old put_user_ex() code that some of the x86 signal
> code uses).
>
> So one of the advantages of just copy_to_user() ends up being visible
> only on Broadwell+ (or whatever the SMAP cutoff is).

Good point. The cpu I was testing on was an AMD A10.
I don't actually have a cpu that supports SMAP handy.

If you would like, I can post the minimal patches and benchmark so anyone
who is interested could reproduce this for themselves.

I suspect that if it is down to only 20ns without SMAP this will
definitely be a performance improvement in the presence of SMAP.

Eric
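
For anyone following the SMAP point quoted above, here is a rough userspace
sketch of why per-field copies cost more there. It is only a model, not
kernel code and not the patches under discussion: struct model_siginfo, the
MODEL_* macros, and the window_opens counter are all made up for
illustration. The one assumption it encodes is that each put_user() opens
and closes the user-access window by itself, while copy_to_user() or a run
of unsafe_put_user() calls inside one begin/end pair pays for the window
once.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a few siginfo fields; NOT the kernel layout. */
struct model_siginfo {
    int signo;
    int err;
    int code;
    int pid;
    int uid;
};

/* Counts how often the modeled user-access window is opened. */
static unsigned long window_opens;

static void model_access_begin(void) { window_opens++; } /* models STAC */
static void model_access_end(void) { }                   /* models CLAC */

/* Model of put_user(): every store pays for its own open/close. */
#define MODEL_PUT_USER(val, ptr) \
    do { model_access_begin(); *(ptr) = (val); model_access_end(); } while (0)

/* Model of unsafe_put_user(): the caller already opened the window. */
#define MODEL_UNSAFE_PUT_USER(val, ptr) \
    do { *(ptr) = (val); } while (0)

/* Field-by-field copy: one window open per field. */
static unsigned long copy_per_field(struct model_siginfo *to,
                                    const struct model_siginfo *from)
{
    window_opens = 0;
    MODEL_PUT_USER(from->signo, &to->signo);
    MODEL_PUT_USER(from->err, &to->err);
    MODEL_PUT_USER(from->code, &to->code);
    MODEL_PUT_USER(from->pid, &to->pid);
    MODEL_PUT_USER(from->uid, &to->uid);
    return window_opens;
}

/* Field-by-field copy inside a single begin/end pair. */
static unsigned long copy_batched_fields(struct model_siginfo *to,
                                         const struct model_siginfo *from)
{
    window_opens = 0;
    model_access_begin();
    MODEL_UNSAFE_PUT_USER(from->signo, &to->signo);
    MODEL_UNSAFE_PUT_USER(from->err, &to->err);
    MODEL_UNSAFE_PUT_USER(from->code, &to->code);
    MODEL_UNSAFE_PUT_USER(from->pid, &to->pid);
    MODEL_UNSAFE_PUT_USER(from->uid, &to->uid);
    model_access_end();
    return window_opens;
}

/* Model of copy_to_user(): one window open around one bulk copy. */
static unsigned long copy_bulk(struct model_siginfo *to,
                               const struct model_siginfo *from)
{
    window_opens = 0;
    model_access_begin();
    memcpy(to, from, sizeof(*to));
    model_access_end();
    return window_opens;
}

/* Runs all three variants and checks they produce the same bytes. */
void model_demo(void)
{
    struct model_siginfo src = { 11, 0, 1, 1234, 1000 };
    struct model_siginfo a = { 0 }, b = { 0 }, c = { 0 };

    unsigned long per_field = copy_per_field(&a, &src);
    unsigned long batched = copy_batched_fields(&b, &src);
    unsigned long bulk = copy_bulk(&c, &src);

    assert(memcmp(&a, &src, sizeof(src)) == 0);
    assert(memcmp(&b, &src, sizeof(src)) == 0);
    assert(memcmp(&c, &src, sizeof(src)) == 0);
    assert(per_field > batched && batched == bulk);
}
```

The counter only shows how many window toggles each shape of copy implies.
It says nothing about real timings, which would still have to be measured on
SMAP-capable hardware.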