Re: [PATCH 0/6] lib/lzo: performance improvements

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: "Markus F.X.J. Oberhumer" <markus@oberhumer.com>
To: Dave Rodgman <dave.rodgman@arm.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Cc: nd <nd@arm.com>,
	"herbert@gondor.apana.org.au" <herbert@gondor.apana.org.au>,
	"davem@davemloft.net" <davem@davemloft.net>,
	Matt Sealey <Matt.Sealey@arm.com>,
	"nitingupta910@gmail.com" <nitingupta910@gmail.com>,
	"rpurdie@openedhand.com" <rpurdie@openedhand.com>,
	"minchan@kernel.org" <minchan@kernel.org>,
	"sergey.senozhatsky.work@gmail.com" 
	<sergey.senozhatsky.work@gmail.com>,
	Sonny Rao <sonnyrao@google.com>
Subject: Re: [PATCH 0/6] lib/lzo: performance improvements
Date: Wed, 21 Nov 2018 14:44:49 +0100	[thread overview]
Message-ID: <5BF56151.5090201@oberhumer.com> (raw)
In-Reply-To: <VI1PR0802MB25289788BFADB136CE08AADA8FDA0@VI1PR0802MB2528.eurprd08.prod.outlook.com>

Hi Dave,

thanks for your patch set. Just some initial comments:


I think the three patches

  [PATCH 2/6] lib/lzo: enable 64-bit CTZ on Arm
  [PATCH 3/6] lib/lzo: 64-bit CTZ on Arm aarch64
  [PATCH 4/6] lib/lzo: fast 8-byte copy on arm64

should be applied in any case - could you please make an extra
pull request out of these and try to get them merged as fast
as possible. Thanks.


The first patch

  [PATCH 1/6] lib/lzo: clean-up by introducing COPY16

does not really affect the resulting code at the moment, but please
note that in one case the actual copy unit is not allowed to
be greater 8 bytes (which might be implied by the name "COPY16").
So this needs more work like an extra COPY16_BY_8() macro.


As for your your "lzo-rle" improvements I'll have a look.

Please note that the first byte value 17 is actually valid when using
external dictionaries ("lzo1x_decompress_dict_safe()" in the LZO source
code). While this functionality is not present in the Linux kernel at
the moment it might be worrisome wrt future enhancements.

Finally I'm wondering if your chart comparisions just compares the "lzo-rle"
patch or also includes the ARM64 improvments - I cannot understand where a
20% speedup should come from if you have 0% zeros.

Cheers,
Markus



On 2018-11-21 13:06, Dave Rodgman wrote:
> This patch series introduces performance improvements for lzo.
> 
> The improvements fall into two categories: general Arm-specific optimisations
> (e.g., more efficient memory access); and the introduction of a special case 
> for handling runs of zeros (which is a common case for zram) using run-length
> encoding.
> 
> The introduction of RLE modifies the bitstream such that it can't be decoded
> by old versions of lzo (the new lzo-rle can correctly decode old bitstreams).
> To avoid possible issues where data is persisted on disk (e.g., squashfs), the 
> final patch in this series separates lzo-rle into a separate algorithm
> alongside lzo, so that the new lzo-rle is (by default) only used for zram and
> must be explicitly selected for other use-cases. This final patch could be
> omitted if the consensus is that we'd rather avoid proliferation of lzo
> variants.
> 
> Overall, performance is improved by around 1.1 - 4.8x (data-dependent: data
> with many zero runs shows higher improvement). Under real-world testing with
> zram, time spent in (de)compression during swapping is reduced by around 27%.
> The graph below shows the weighted round-trip throughput of lzo, lz4 and
> lzo-rle, for randomly generated 4k chunks of data with varying levels of
> entropy. (To calculate weighted round-trip throughput, compression performance
> is emphasised to reflect the fact that zram does around 2.25x more compression
> than decompression. (Results and overall trends are fairly similar for
> unweighted).
> 
> https://drive.google.com/file/d/18GU4pgRVCLNN7wXxynz-8R2ygrY2IdyE/view
> 
> Contributors:
> Dave Rodgman <dave.rodgman@arm.com>
> Matt Sealey <matt.sealey@arm.com>
> 

-- 
Markus Oberhumer, <markus@oberhumer.com>, http://www.oberhumer.com/

next prev parent reply	other threads:[~2018-11-21 13:52 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-21 12:06 [PATCH 0/6] lib/lzo: performance improvements Dave Rodgman
2018-11-21 13:44 ` Markus F.X.J. Oberhumer [this message]
2018-11-21 16:55   ` Dave Rodgman
2018-11-21 17:04   ` Matt Sealey
2018-11-23  2:12 ` Sergey Senozhatsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5BF56151.5090201@oberhumer.com \
    --to=markus@oberhumer.com \
    --cc=Matt.Sealey@arm.com \
    --cc=dave.rodgman@arm.com \
    --cc=davem@davemloft.net \
    --cc=herbert@gondor.apana.org.au \
    --cc=linux-kernel@vger.kernel.org \
    --cc=minchan@kernel.org \
    --cc=nd@arm.com \
    --cc=nitingupta910@gmail.com \
    --cc=rpurdie@openedhand.com \
    --cc=sergey.senozhatsky.work@gmail.com \
    --cc=sonnyrao@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox