Date: Sun, 18 Jul 2021 13:59:56 -0700
From: Brad Boyer
To: Michael Schmitz
Cc: Andreas Schwab, "Eric W. Biederman", geert@linux-m68k.org,
 linux-arch@vger.kernel.org, linux-m68k@lists.linux-m68k.org,
 torvalds@linux-foundation.org
Subject: Re: [PATCH v4 0/3] m68k: Improved switch stack handling
Message-ID: <20210718205956.GA802@allandria.com>
References: <1624407696-20180-1-git-send-email-schmitzmic@gmail.com>
 <87zgunzovm.fsf@disp2133>
 <3b4f287b-7be2-0e7b-ae5a-6c11972601fb@gmail.com>
 <1b656c02-925c-c4ba-03d3-f56075cdfac5@gmail.com>
 <8735scvklk.fsf@disp2133>
 <87a6mj99vf.fsf@igel.home>
 <1ebfb9de-de16-d05c-ea15-a110857fe284@gmail.com>
In-Reply-To: <1ebfb9de-de16-d05c-ea15-a110857fe284@gmail.com>
X-Mailing-List: linux-arch@vger.kernel.org

On Mon, Jul 19, 2021 at 07:47:19AM +1200, Michael Schmitz wrote:
> Somewhere in entry.S is
>
> addql #8,%sp
> addql #4,%sp
>
> - is that faster than
>
> lea 12(%sp),%sp ?

On the 68040 the timing can depend on the other instructions around
it. Each of those addql instructions is listed as 1 and 1 for
fetch/execute, while that lea is listed as 2 and 1L+1, meaning that
it could potentially be faster depending on the behavior of the
instruction that preceded it through the execute stage. That one free
cycle if the stage is busy (due to the 1L) could make it effectively
faster, since the first addql would have to wait that extra cycle in
that case.

On the 68060, it looks like the lea version is the clear winner,
although the timing description is obviously much more complicated
and thus I might have missed something. From a quick look, it seems
that lea takes the same time as just the first addql.

On CPU32, the lea version loses due to the extra 3 cycles from the
addressing mode, even though the base cycles of lea are the same as
for addql (2 cycles each). The lea might be even worse if it can't
take advantage of overlapping the surrounding instructions (1 cycle
before and 1 after).

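
The CPU32 comparison above can be tallied in a few lines. This is only
a rough sketch that adds up the cycle figures quoted in this message;
the constant and function names are hypothetical, the numbers are not
independently checked against the CPU32 manual, and overlap with the
surrounding instructions is not modeled.

```python
# Cycle figures as quoted in this message for CPU32 (assumed, not
# re-derived from the manual). Instruction overlap is left out.
ADDQL_CYCLES = 2         # base cycles for each addql
LEA_BASE_CYCLES = 2      # base cycles for lea
LEA_EA_EXTRA_CYCLES = 3  # extra cycles for the 12(%sp) addressing mode


def addql_pair_cycles():
    """Cycles for `addql #8,%sp` followed by `addql #4,%sp`."""
    return 2 * ADDQL_CYCLES


def lea_cycles():
    """Cycles for `lea 12(%sp),%sp`."""
    return LEA_BASE_CYCLES + LEA_EA_EXTRA_CYCLES


if __name__ == "__main__":
    print("addql pair:", addql_pair_cycles())
    print("lea:", lea_cycles())
```

With these figures the addql pair comes to 4 cycles and the lea to 5,
before any overlap is considered.
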
Those are the only ones for which I already have the documentation
at hand. I haven't checked the older classic cores or ColdFire, but
it does seem that which version is faster is specific to each chip.
Obviously both versions would be the same size (2 words).

	Brad Boyer
	flar@allandria.com