linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Joonsoo Kim <js1304@gmail.com>
To: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Andrey Ryabinin <aryabinin@virtuozzo.com>,
	Alexander Potapenko <glider@google.com>,
	kasan-dev <kasan-dev@googlegroups.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H . Peter Anvin" <hpa@zytor.com>,
	kernel-team@lge.com
Subject: Re: [PATCH v1 00/11] mm/kasan: support per-page shadow memory to reduce memory consumption
Date: Wed, 17 May 2017 16:23:17 +0900	[thread overview]
Message-ID: <20170517072315.GA18406@js1304-desktop> (raw)
In-Reply-To: <CACT4Y+anOw8=7u-pZ2ceMw0xVnuaO9YKBJAr-2=KOYt_72b2pw@mail.gmail.com>

On Tue, May 16, 2017 at 01:49:10PM -0700, Dmitry Vyukov wrote:
> > Anyway, I have missed inline instrumentation completely.
> >
> > I will attach the fix in the bottom. It doesn't look beautiful
> > since it breaks layer design (some check will be done at report
> > function). However, I think that it's a good trade-off.
> 
> 
> I can confirm that inline works with that patch.

Thanks for confirming!

> 
> I can also confirm that it reduces memory usage. I've booted qemu with
> 2G ram and run some fixed workload. Before:
> 31853 dvyukov   20   0 3043200 765464  21312 S 366.0  4.7   2:39.53
> qemu-system-x86
>  7528 dvyukov   20   0 3043200 732444  21676 S 333.3  4.5   2:23.19
> qemu-system-x86
> After:
> 6192 dvyukov   20   0 3043200 394244  20636 S  17.9  2.4   2:32.95
> qemu-system-x86
>  6265 dvyukov   20   0 3043200 388860  21416 S 399.3  2.4   3:02.88
> qemu-system-x86
>  9005 dvyukov   20   0 3043200 383564  21220 S 397.1  2.3   2:35.33
> qemu-system-x86
> 
> However, I see some very significant slowdowns with inline
> instrumentation. I did 3 tests:
> 1. Boot speed, I measured time for a particular message to appear on
> console. Before:
> [    2.504652] random: crng init done
> [    2.435861] random: crng init done
> [    2.537135] random: crng init done
> After:
> [    7.263402] random: crng init done
> [    7.263402] random: crng init done
> [    7.174395] random: crng init done
> 
> That's ~3x slowdown.
> 
> 2. I've run bench_readv benchmark:
> https://raw.githubusercontent.com/google/sanitizers/master/address-sanitizer/kernel_buildbot/slave/bench_readv.c
> as:
> while true; do time ./bench_readv bench_readv 300000 1; done
> 
> Before:
> sys 0m7.299s
> sys 0m7.218s
> sys 0m6.973s
> sys 0m6.892s
> sys 0m7.035s
> sys 0m6.982s
> sys 0m6.921s
> sys 0m6.940s
> sys 0m6.905s
> sys 0m7.006s
> 
> After:
> sys 0m8.141s
> sys 0m8.077s
> sys 0m8.067s
> sys 0m8.116s
> sys 0m8.128s
> sys 0m8.115s
> sys 0m8.108s
> sys 0m8.326s
> sys 0m8.529s
> sys 0m8.164s
> sys 0m8.380s
> 
> This is ~19% slowdown.
> 
> 3. I've run bench_pipes benchmark:
> https://raw.githubusercontent.com/google/sanitizers/master/address-sanitizer/kernel_buildbot/slave/bench_pipes.c
> as:
> while true; do time ./bench_pipes 10 10000 1; done
> 
> Before:
> sys 0m5.393s
> sys 0m6.178s
> sys 0m5.909s
> sys 0m6.024s
> sys 0m5.874s
> sys 0m5.737s
> sys 0m5.826s
> sys 0m5.664s
> sys 0m5.758s
> sys 0m5.421s
> sys 0m5.444s
> sys 0m5.479s
> sys 0m5.461s
> sys 0m5.417s
> 
> After:
> sys 0m8.718s
> sys 0m8.281s
> sys 0m8.268s
> sys 0m8.334s
> sys 0m8.246s
> sys 0m8.267s
> sys 0m8.265s
> sys 0m8.437s
> sys 0m8.228s
> sys 0m8.312s
> sys 0m8.556s
> sys 0m8.680s
> 
> This is ~52% slowdown.
> 
> 
> This does not look acceptable to me. I would ready to pay for this,
> say, 10% of performance. But it seems that this can have up to 2-4x
> slowdown for some workloads.

I found the reasons of above regression. There are two reasons.

1. In my implementation, original shadow to the memory allocated from
memblock is black shadow so it causes to call kasan_report(). It will
pass the check since per page shadow would be zero shadow but it
causes some overhead.

2. Memory used by stackdepot is in a similar situation with #1. It
allocates page and divide it to many objects. Then, use it like as
object. Although there is "KASAN_SANITIZE_stackdepot.o := n" which try
to disable sanitizer, there is a function call (memcmp() in
find_stack()) to other file and sanitizer work for it.

#1 problem can be fixed but more investigation is needed. I will
respin the series after fixing it.

#2 problem also can be fixed. There are two options here. First, uses
private memcmp() for stackdepot and disable sanitizer for it. I think
that this is a right approach since it slowdown the performance in all
KASAN build cases. And, we don't want to sanitize KASAN itself.
Second, I can provide a function to map the actual shadow manually. It
will reduce the case calling kasan_report().

See the attached patch. It implements later approach on #2 problem.
It would reduce performance regression. I have tested your bench_pipes
test with it and found that performance is restored. However, there is
still remaining problem, #1, so I'm not sure that it completely
restore your regression. Could you check that if possible?

Anyway, I think that respin is needed to fix this performance problem
completely.

> 
> 
> Your use-case is embed devices where you care a lot about both code
> size and memory consumption, right?

Yes.

> I see 2 possible ways forward:
> 1. Enable this new mode only for outline, but keep current scheme for
> inline. Then outline will be "small but slow" type of configuration.

Performance problem is not that bad in OUTLINE build. Therefore,
this is a reasonable option to have.

> 2. Somehow fix slowness (at least in inline mode).

I will try to fix slowness as much as possible. If slowness cannot be
acceptable after such effort, we can choose the direction at that
moment.

> 
> > Mapping zero page to non-kernel memory could cause true-negative
> > problem since we cannot flush the TLB in all cpus. We will read zero
> > shadow value value in this case even if actual shadow value is not
> > zero. This is one of the reason that black page is introduced in this
> > patchset.
> 
> What does make your current patch work then?
> Say we map a new shadow page, update the page shadow to say that there
> is mapped shadow. Then another CPU loads the page shadow and then
> loads from the newly mapped shadow. If we don't flush TLB, what makes
> the second CPU see the newly mapped shadow?

There is a fix-up processing to see the newly mapped shadow in other
cpus. check_memory_region_slow() exists for that purpose. In stale TLB
case, we will see black shadow and fall in to this function. In this
function, we flush stale TLB and re-check so we can see correct
result.

Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2017-05-17  7:23 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-05-16  1:16 [PATCH v1 00/11] mm/kasan: support per-page shadow memory to reduce memory consumption js1304
2017-05-16  1:16 ` [PATCH v1 01/11] mm/kasan: rename XXX_is_zero to XXX_is_nonzero js1304
2017-05-16  1:16 ` [PATCH v1 02/11] mm/kasan: don't fetch the next shadow value speculartively js1304
2017-05-16  1:16 ` [PATCH v1 03/11] mm/kasan: handle unaligned end address in zero_pte_populate js1304
2017-05-16  1:16 ` [PATCH v1 04/11] mm/kasan: extend kasan_populate_zero_shadow() js1304
2017-05-16  1:16 ` [PATCH v1 05/11] mm/kasan: introduce per-page shadow memory infrastructure js1304
2017-05-16  1:16 ` [PATCH v1 06/11] mm/kasan: mark/unmark the target range that is for original shadow memory js1304
2017-05-16  1:16 ` [PATCH v1 07/11] x86/kasan: use per-page " js1304
2017-05-16  1:16 ` [PATCH v1 08/11] mm/kasan: support on-demand shadow allocation/mapping js1304
2017-05-16  1:16 ` [PATCH v1 09/11] x86/kasan: support on-demand shadow mapping js1304
2017-05-16  1:16 ` [PATCH v1 10/11] mm/kasan: support dynamic shadow memory free js1304
2017-05-16  1:16 ` [PATCH v1 11/11] mm/kasan: change the order of shadow memory check js1304
2017-05-16  1:28 ` [PATCH(RE-RESEND) v1 01/11] mm/kasan: rename _is_zero to _is_nonzero Joonsoo Kim
2017-05-16  4:34 ` [PATCH v1 00/11] mm/kasan: support per-page shadow memory to reduce memory consumption Dmitry Vyukov
2017-05-16  4:47   ` Dmitry Vyukov
2017-05-16  6:23   ` Joonsoo Kim
2017-05-16 20:49     ` Dmitry Vyukov
2017-05-17  7:23       ` Joonsoo Kim [this message]
2017-05-17  7:25         ` Joonsoo Kim
2017-05-24  6:57       ` Dmitry Vyukov
2017-05-24  7:45         ` Joonsoo Kim
2017-05-24 17:19           ` Dmitry Vyukov
2017-05-25  0:41             ` Joonsoo Kim
2017-05-29 15:07               ` Dmitry Vyukov
2017-05-29 15:12                 ` Dmitry Vyukov
2017-05-29 15:29                   ` Dmitry Vyukov
2017-05-30  7:58                     ` Vladimir Murzin
2017-05-30  8:15                       ` Dmitry Vyukov
2017-05-30  8:31                         ` Vladimir Murzin
2017-05-30  8:40                           ` Vladimir Murzin
2017-05-30  8:49                             ` Dmitry Vyukov
2017-05-30  9:08                               ` Vladimir Murzin
2017-05-30  9:26                                 ` Dmitry Vyukov
2017-05-30  9:39                                   ` Vladimir Murzin
2017-05-30  9:45                                     ` Dmitry Vyukov
2017-05-30  9:54                                       ` Vladimir Murzin
2017-05-30 14:16                     ` Andrey Ryabinin
2017-05-31  5:50                       ` Joonsoo Kim
2017-05-31 16:31                         ` Andrey Ryabinin
2017-06-08  2:43                           ` Joonsoo Kim
2017-06-01 15:16                       ` 王靖天
2017-06-01 18:06                       ` Dmitry Vyukov
2017-06-08  2:40                         ` Joonsoo Kim
2017-06-13 16:49                           ` Andrey Ryabinin
2017-06-14  0:12                             ` Joonsoo Kim
2017-05-17 12:17 ` Andrey Ryabinin
2017-05-19  1:53   ` Joonsoo Kim
2017-05-22  6:02     ` Dmitry Vyukov
2017-05-24  6:04       ` Joonsoo Kim
2017-05-24 16:31         ` Dmitry Vyukov
2017-05-25  0:46           ` Joonsoo Kim
2017-05-22 14:00     ` Andrey Ryabinin
2017-05-24  6:18       ` Joonsoo Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170517072315.GA18406@js1304-desktop \
    --to=js1304@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=aryabinin@virtuozzo.com \
    --cc=dvyukov@google.com \
    --cc=glider@google.com \
    --cc=hpa@zytor.com \
    --cc=kasan-dev@googlegroups.com \
    --cc=kernel-team@lge.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mingo@redhat.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).