From: Dave Rodgman <dave.rodgman@arm.com>
To: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Matt Sealey <Matt.Sealey@arm.com>,
	Dave Rodgman <dave.rodgman@arm.com>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
	"herbert@gondor.apana.org.au" <herbert@gondor.apana.org.au>,
	"markus@oberhumer.com" <markus@oberhumer.com>,
	"minchan@kernel.org" <minchan@kernel.org>,
	"nitingupta910@gmail.com" <nitingupta910@gmail.com>,
	"rpurdie@openedhand.com" <rpurdie@openedhand.com>,
	"sergey.senozhatsky.work@gmail.com" 
	<sergey.senozhatsky.work@gmail.com>,
	"sonnyrao@google.com" <sonnyrao@google.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"sfr@canb.auug.org.au" <sfr@canb.auug.org.au>
Cc: nd <nd@arm.com>
Subject: [PATCH v5 0/3]: lib/lzo: run-length encoding support
Date: Tue, 5 Feb 2019 15:59:59 +0000
Message-ID: <20190205155944.16007-1-dave.rodgman@arm.com>

Hi,

Following on from the previous lzo-rle patchset:

https://lkml.org/lkml/2018/11/30/972

This patchset contains only the RLE patches, and should be applied on top of
the non-RLE patches ( https://lkml.org/lkml/2019/2/5/366 ).


Previously, some questions were raised around the RLE patches. I've done some
additional benchmarking to answer these questions. In short:

 - RLE offers significant additional performance (data-dependent)
 - I didn't measure any regressions that were clearly outside the noise


One concern with this patchset was around performance - specifically, measuring
RLE impact separately from Matt Sealey's patches (CTZ & fast copy). I have done
some additional benchmarking which I hope clarifies the benefits of each part
of the patchset.

Firstly, I've captured some memory via /dev/fmem from a Chromebook that had
many tabs open and was starting to swap, and then split this into 4178 4k pages.
I've excluded the all-zero pages (as zram does), and also the no-zero pages
(which won't tell us anything about RLE performance). This should give a
realistic test dataset for zram. What I found was that the data is VERY
bimodal: 44% of pages in this dataset contain 5% or fewer zeros, and 44%
contain over 90% zeros (30% if you include the no-zero pages). This supports
the idea of special-casing zeros in zram.
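
A minimal sketch of this kind of page classification, assuming the captured
memory is available as a raw bytes object. The helper names are hypothetical
and this is not the script used to produce the numbers above.

```python
PAGE_SIZE = 4096  # 4k pages, as in the dataset described above


def split_pages(dump, page_size=PAGE_SIZE):
    """Split a raw dump into whole pages, dropping any trailing partial page."""
    usable = len(dump) - len(dump) % page_size
    return [dump[i:i + page_size] for i in range(0, usable, page_size)]


def zero_fraction(page):
    """Fraction of the bytes in a page that are zero."""
    return page.count(0) / len(page)


def classify(pages):
    """Drop all-zero and no-zero pages, then bucket the rest by zero content."""
    fractions = [zero_fraction(p) for p in pages]
    kept = [z for z in fractions if 0.0 < z < 1.0]
    return {
        "kept": len(kept),
        "few_zeros": sum(1 for z in kept if z <= 0.05),   # 5% or fewer zeros
        "many_zeros": sum(1 for z in kept if z > 0.90),   # over 90% zeros
    }
```

Calling classify(split_pages(dump)) on a dump gives the counts needed to work
out the two percentages quoted above.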

Next, I've benchmarked four variants of lzo on these pages (on 64-bit Arm at
max frequency): baseline LZO; baseline + Matt Sealey's patches (aka MS);
baseline + RLE only; baseline + MS + RLE. Numbers are for weighted roundtrip
throughput (the weighting reflects that zram does more compression than
decompression).

https://drive.google.com/file/d/1VLtLjRVxgUNuWFOxaGPwJYhl_hMQXpHe/view?usp=sharing
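
The weighting itself is not spelled out above. A minimal sketch of one way to
combine the two directions, assuming a hypothetical 2:1 compress-to-decompress
weighting and a time-weighted (harmonic) mean. Both the ratio and the function
name are illustrative assumptions, not the ones behind the graph.

```python
def weighted_roundtrip_throughput(comp_mbps, decomp_mbps,
                                  comp_weight=2.0, decomp_weight=1.0):
    """Combine compression and decompression throughput into one figure.

    The weights are hypothetical: they say that, per unit of data decompressed,
    comp_weight / decomp_weight units are compressed. Time is what adds up, so
    the result is a weighted harmonic mean of the two throughputs (MB/s).
    """
    if comp_mbps <= 0 or decomp_mbps <= 0:
        raise ValueError("throughputs must be positive")
    total_weight = comp_weight + decomp_weight
    # Seconds spent to process comp_weight MB one way and decomp_weight MB the other.
    total_time = comp_weight / comp_mbps + decomp_weight / decomp_mbps
    return total_weight / total_time
```

A harmonic mean is used here because averaging MB/s figures directly would
overstate the result when one direction is much slower than the other.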

Matt's patches help in all cases on Arm (and have no effect on Intel), as
expected.

RLE also behaves as expected: with few zeros present, it makes no difference;
above ~75% zeros, it gives a good improvement (50 - 300 MB/s on top of the
benefit from Matt's patches).

Best performance is seen with both MS and RLE patches.

Finally, I have benchmarked the same dataset on an x86-64 device. Here, the
MS patches make no difference (as expected); RLE helps, much as it does on Arm.
There were no definite regressions; allowing for observational error, 0.1%
(3/4178) of cases had a regression > 1 standard deviation, of which the largest
was 4.6% (1.2 standard deviations). I think this is probably within the noise.

https://drive.google.com/file/d/1xCUVwmiGD0heEMx5gcVEmLBI4eLaageV/view?usp=sharing
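
A minimal sketch of how a per-page regression could be expressed in standard
deviations, assuming several repeated throughput measurements per page and
using the baseline's sample standard deviation as the noise estimate. The
function name and that choice of noise estimate are assumptions for
illustration, not the procedure behind the figures above.

```python
import statistics


def regression_in_sigmas(baseline_runs, patched_runs):
    """Return (percent_drop, drop_in_standard_deviations) for one page.

    Each argument is a list of throughput samples (e.g. MB/s) with at least
    two entries in baseline_runs, since a sample standard deviation needs two
    or more points. Positive values mean the patched code was slower.
    """
    base_mean = statistics.mean(baseline_runs)
    patched_mean = statistics.mean(patched_runs)
    sigma = statistics.stdev(baseline_runs)  # assumed noise estimate

    drop = base_mean - patched_mean
    percent = 100.0 * drop / base_mean
    if sigma == 0:
        # No measured noise: any drop at all is infinitely many sigmas.
        sigmas = float("inf") if drop > 0 else 0.0
    else:
        sigmas = drop / sigma
    return percent, sigmas
```

Under these assumptions, a page would count as a possible regression when the
second value exceeds 1.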

One point to note is that the graphs show RLE appears to help very slightly
with no zeros present! This is because the extra code causes the clang
optimiser to change code layout in a way that happens to have a significant
benefit. Taking baseline LZO and adding a do-nothing line like
"__builtin_prefetch(out_len);" immediately before the "goto next" has the same
effect. So this is a real, but basically spurious effect - it's small enough
not to upset the overall findings.

Dave



Thread overview: 6+ messages
2019-02-05 15:59 Dave Rodgman [this message]
2019-02-05 16:00 ` [PATCH v5 1/3] lib/lzo: implement run-length encoding Dave Rodgman
2019-02-05 16:00 ` [PATCH v5 2/3] lib/lzo: separate lzo-rle from lzo Dave Rodgman
2019-02-05 16:00 ` [PATCH v5 3/3] zram: default to lzo-rle instead of lzo Dave Rodgman
2024-03-08  3:25 ` [PATCH v5 0/3]: lib/lzo: run-length encoding support Tao Liu
     [not found]   ` <AS8PR08MB102898FB26D627E790FE8D4638F272@AS8PR08MB10289.eurprd08.prod.outlook.com>
2024-03-12  8:28     ` Tao Liu
