From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Cc: "linuxppc-dev@ozlabs.org" <linuxppc-dev@ozlabs.org>
Subject: Re: [PATCH] zlib: Optimize inffast when copying direct from output
Date: Tue, 24 Nov 2009 14:12:43 +1100
Message-ID: <1259032363.16367.108.camel@pasglop>
In-Reply-To: <1257843644-8496-1-git-send-email-Joakim.Tjernlund@transmode.se>
On Tue, 2009-11-10 at 10:00 +0100, Joakim Tjernlund wrote:
> JFFS2 uses lesser compression ratio and inflate always
> ends up in "copy direct from output" case.
> This patch tries to optimize the direct copy procedure.
> Uses get_unaligned() but only in one place.
> The copy loop just above this one can also use this
> optimization, but I haven't done so as I have not tested if it
> is a win there too.
> On my MPC8321 this is about 17% faster on my JFFS2 root FS
> than the original.
> ---
>
> Would like some testing of the PowerPC boot wrapper and
> a LE target before sending it upstream.
Well, you should probably submit that patch to lkml then :-)
I'm not sure it's going to work to use get_unaligned() like that on all
archs ... it is definitely something to discuss on a more
appropriate mailing list.
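
To make the portability concern concrete, here is a minimal user-space
sketch of an alignment-safe 16-bit access. It is not the kernel's
get_unaligned(); the helper names load16_unaligned() and
store16_unaligned() are hypothetical, and the memcpy() approach is just one
common way to avoid a misaligned load on architectures that fault on them.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not the kernel's get_unaligned(): read a 16-bit
 * value from an address that may be odd. memcpy() has no alignment
 * requirement, so this is safe even where a plain
 * *(uint16_t *)p would trap.
 */
static uint16_t load16_unaligned(const void *p)
{
    uint16_t v;

    assert(p != NULL);
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Hypothetical counterpart for stores to a possibly odd address. */
static void store16_unaligned(void *p, uint16_t v)
{
    assert(p != NULL);
    memcpy(p, &v, sizeof(v));
}
```

The value returned is in host byte order, so code that picks individual
bytes out of it (as the dist == 1 case in the patch does) still has to
care about endianness.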
Cheers,
Ben.
> arch/powerpc/boot/Makefile | 4 ++-
> lib/zlib_inflate/inffast.c | 48 +++++++++++++++++++++++++++++++++----------
> 2 files changed, 40 insertions(+), 12 deletions(-)
>
> diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile
> index 9ae7b7e..98e4c4f 100644
> --- a/arch/powerpc/boot/Makefile
> +++ b/arch/powerpc/boot/Makefile
> @@ -20,7 +20,7 @@
> all: $(obj)/zImage
>
> BOOTCFLAGS := -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs \
> - -fno-strict-aliasing -Os -msoft-float -pipe \
> + -fno-strict-aliasing -Os -msoft-float -pipe -D__KERNEL__\
> -fomit-frame-pointer -fno-builtin -fPIC -nostdinc \
> -isystem $(shell $(CROSS32CC) -print-file-name=include)
> BOOTAFLAGS := -D__ASSEMBLY__ $(BOOTCFLAGS) -traditional -nostdinc
> @@ -34,6 +34,8 @@ BOOTCFLAGS += -fno-stack-protector
> endif
>
> BOOTCFLAGS += -I$(obj) -I$(srctree)/$(obj)
> +BOOTCFLAGS += -include include/linux/autoconf.h -Iarch/powerpc/include
> +BOOTCFLAGS += -Iinclude
>
> DTS_FLAGS ?= -p 1024
>
> diff --git a/lib/zlib_inflate/inffast.c b/lib/zlib_inflate/inffast.c
> index 8550b0c..0c7fa3d 100644
> --- a/lib/zlib_inflate/inffast.c
> +++ b/lib/zlib_inflate/inffast.c
> @@ -4,6 +4,7 @@
> */
>
> #include <linux/zutil.h>
> +#include <asm/unaligned.h>
> #include "inftrees.h"
> #include "inflate.h"
> #include "inffast.h"
> @@ -24,9 +25,11 @@
> #ifdef POSTINC
> # define OFF 0
> # define PUP(a) *(a)++
> +# define UP_UNALIGNED(a) get_unaligned((a)++)
> #else
> # define OFF 1
> # define PUP(a) *++(a)
> +# define UP_UNALIGNED(a) get_unaligned(++(a))
> #endif
>
> /*
> @@ -239,18 +242,41 @@ void inflate_fast(z_streamp strm, unsigned start)
> }
> }
> else {
> + unsigned short *sout;
> + unsigned long loops;
> +
> from = out - dist; /* copy direct from output */
> - do { /* minimum length is three */
> - PUP(out) = PUP(from);
> - PUP(out) = PUP(from);
> - PUP(out) = PUP(from);
> - len -= 3;
> - } while (len > 2);
> - if (len) {
> - PUP(out) = PUP(from);
> - if (len > 1)
> - PUP(out) = PUP(from);
> - }
> + /* minimum length is three */
> + /* Align out addr */
> + if (!((long)(out - 1 + OFF) & 1)) {
> + PUP(out) = PUP(from);
> + len--;
> + }
> + sout = (unsigned short *)(out - OFF);
> + if (dist > 2 ) {
> + unsigned short *sfrom;
> +
> + sfrom = (unsigned short *)(from - OFF);
> + loops = len >> 1;
> + do
> + PUP(sout) = UP_UNALIGNED(sfrom);
> + while (--loops);
> + out = (unsigned char *)sout + OFF;
> + from = (unsigned char *)sfrom + OFF;
> + } else { /* dist == 1 or dist == 2 */
> + unsigned short pat16;
> +
> + pat16 = *(sout-2+2*OFF);
> + if (dist == 1)
> + pat16 = (pat16 & 0xff) | ((pat16 & 0xff ) << 8);
> + loops = len >> 1;
> + do
> + PUP(sout) = pat16;
> + while (--loops);
> + out = (unsigned char *)sout + OFF;
> + }
> + if (len & 1)
> + PUP(out) = PUP(from);
> }
> }
> else if ((op & 64) == 0) { /* 2nd level distance code */
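
For readers who want to experiment with the idea outside the kernel, the
"copy direct from output" step can be sketched as a stand-alone function.
This is an illustrative approximation under stated assumptions, not the
patch itself: copy_from_output() is a hypothetical name, it uses
post-increment pointers, and it moves each 16-bit pair with memcpy() instead
of aligning out first and calling get_unaligned(). Building the dist == 1
pattern bytewise also keeps it independent of host endianness.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical sketch of an LZ77 back-reference copy: append len bytes
 * at out, each a copy of the byte dist positions earlier. Source and
 * destination may overlap, so one memcpy() over the whole range would
 * be wrong. Returns the new end of output.
 */
static unsigned char *copy_from_output(unsigned char *out,
                                       unsigned dist, unsigned len)
{
    uint16_t pair;

    assert(dist >= 1 && len >= 3);  /* minimum match length is three */

    if (dist == 1) {
        /* Run of one repeated byte: fill both bytes of the pair. */
        memset(&pair, *(out - 1), sizeof(pair));
        for (; len >= 2; len -= 2, out += 2)
            memcpy(out, &pair, sizeof(pair));
    } else {
        /*
         * dist >= 2: both source bytes of each pair are already
         * written, so two bytes can be moved per iteration.
         */
        for (; len >= 2; len -= 2, out += 2) {
            memcpy(&pair, out - dist, sizeof(pair));
            memcpy(out, &pair, sizeof(pair));
        }
    }
    if (len) {                      /* odd trailing byte */
        *out = *(out - dist);
        out++;
    }
    return out;
}
```

With a buffer that starts with "abc", calling
copy_from_output(buf + 3, 3, 7) extends it to "abcabcabca". Whether the
two-bytes-at-a-time loop is actually faster than the byte loop depends on
the target, which is why the original posting asks for measurements on
other platforms.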