From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S262181AbTIQQ5i (ORCPT ); Wed, 17 Sep 2003 12:57:38 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S262594AbTIQQ5i (ORCPT ); Wed, 17 Sep 2003 12:57:38 -0400 Received: from dbl.q-ag.de ([80.146.160.66]:16768 "EHLO dbl.q-ag.de") by vger.kernel.org with ESMTP id S262181AbTIQQ5h (ORCPT ); Wed, 17 Sep 2003 12:57:37 -0400 Message-ID: <3F689270.7050606@colorfullife.com> Date: Wed, 17 Sep 2003 18:57:20 +0200 From: Manfred Spraul User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030701 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Pavel Machek CC: Jeff Garzik , linux-kernel@vger.kernel.org Subject: Re: [PATCH] add likely around access_ok for uaccess References: <3F644E36.5010402@colorfullife.com> <20030916194929.GF602@elf.ucw.cz> <20030916205545.GA14153@gtf.org> <20030916205931.GF1205@elf.ucw.cz> In-Reply-To: <20030916205931.GF1205@elf.ucw.cz> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Pavel Machek wrote: >Hi! > > > >>>>while trying to figure out why sysv msg is around 30% slower than pipes >>>>for data transfers I noticed that gcc's autodetection (3.2.2) guesses >>>>the "if(access_ok())" tests in uaccess.h wrong and puts the error memset >>>>into the direct path and the copy out of line. >>>> >>>>The attached patch adds likely to the tests - any objections? What about >>>>the archs except i386? >>>> >>>> >>>How much speedup did you gain? >>> >>> >>How much can it hurt? >> >> > >The change is obviously okay, I just wanted to know... If it gains 30% >on sysv messages.. that would be pretty big surprise. > No. I didn't benchmark it at all. I'd expect one or two cycles. The 30% difference are due to misaligned buffers: The sysvmsg ABI uses struct msgbuf { unsigned long type; char data[]; }; msgbuf was aligned, thus data on an not 8-byte aligned address. Thus the "rep;movsl" microcode fastpath didn't kick in, and that caused the 30-50% slowdown. After manually misaligning the msgbuf structure, and properly aligning the kernel buffers, the performance of sysvmsg is now identical to pipes: Around 4 k cycles for a one-byte ping-pong, and around 10k cycles for 4 kB ping-pong, with a Celeron mobile 1.13 GHz. I haven't decided yet if a patch to align the kernel buffers is a good thing or not - it only helps my benchmark, I'm not aware of a real-world app that uses sysvmsg for bulk data transfers. -- Manfred