From mboxrd@z Thu Jan 1 00:00:00 1970 From: Benjamin Herrenschmidt Subject: Re: [PATCH] PPC assembly implementation of SHA1 Date: Sun, 24 Apr 2005 12:49:28 +1000 Message-ID: <1114310968.5444.24.camel@gaston> References: <20050423124246.30071.qmail@science.horizon.com> Mime-Version: 1.0 Content-Type: text/plain Content-Transfer-Encoding: 7bit Cc: Paul Mackerras , git@vger.kernel.org X-From: git-owner@vger.kernel.org Sun Apr 24 04:40:39 2005 Return-path: Received: from vger.kernel.org ([12.107.209.244]) by ciao.gmane.org with esmtp (Exim 4.43) id 1DPX2s-0000RZ-3Y for gcvg-git@gmane.org; Sun, 24 Apr 2005 04:40:30 +0200 Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S262234AbVDXCpN (ORCPT ); Sat, 23 Apr 2005 22:45:13 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S262235AbVDXCpN (ORCPT ); Sat, 23 Apr 2005 22:45:13 -0400 Received: from gate.crashing.org ([63.228.1.57]:10161 "EHLO gate.crashing.org") by vger.kernel.org with ESMTP id S262234AbVDXCpH (ORCPT ); Sat, 23 Apr 2005 22:45:07 -0400 Received: from gaston (localhost [127.0.0.1]) by gate.crashing.org (8.12.8/8.12.8) with ESMTP id j3O2dngJ000404; Sat, 23 Apr 2005 21:39:52 -0500 To: linux@horizon.com In-Reply-To: <20050423124246.30071.qmail@science.horizon.com> X-Mailer: Evolution 2.0.4 Sender: git-owner@vger.kernel.org Precedence: bulk X-Mailing-List: git@vger.kernel.org On Sat, 2005-04-23 at 12:42 +0000, linux@horizon.com wrote: > - You don't need to decrement %r1 before saving registers. > The PPC calling convention defines a "red zone" below the > current stack pointer that is guaranteed never to be touched > by signal handlers or the like. This is specifically for > leaf procedure optimization, and is at least 224 bytes. On ELF ppc32 ABI ? Hrm... the SysV ABI document says "The only permitted references with negative offsets from the stack pointer are those described here for establishing a stack frame." I know MacOS had a red zone, but do we ? > - Is that many stw/lwz instructions faster than stmw/lmw? > The latter is at least more cahce-friendly. More cache friendly ? Hrm.. I wouldn't be sure. Also, I remember readind a while ago that those store/load multiple instructions aren't that recommended. They aren't very good on some CPU models. I would expect the bus/cache bandwidth to be the limiting factor here anyway, and as you point out below, the they don't deal with unaligned access very well in all cases.