From: Minchan Kim <minchan@kernel.org>
To: Daniel Micay <danielmicay@gmail.com>
Cc: Aliaksey Kandratsenka <alkondratenko@gmail.com>,
Shaohua Li <shli@fb.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, linux-api@vger.kernel.org,
Rik van Riel <riel@redhat.com>, Hugh Dickins <hughd@google.com>,
Mel Gorman <mel@csn.ul.ie>, Johannes Weiner <hannes@cmpxchg.org>,
Michal Hocko <mhocko@suse.cz>,
Andy Lutomirski <luto@amacapital.net>,
"google-perftools@googlegroups.com"
<google-perftools@googlegroups.com>
Subject: Re: [PATCH] mremap: add MREMAP_NOHOLE flag --resend
Date: Thu, 26 Mar 2015 16:02:22 +0900
Message-ID: <20150326070221.GA26725@blaptop>
In-Reply-To: <55135F06.4000906@gmail.com>
On Wed, Mar 25, 2015 at 09:21:10PM -0400, Daniel Micay wrote:
> > I haven't followed this thread. However, since you mentioned that
> > MADV_FREE causes many page faults, I'm jumping in here.
> > One of the benefits of MADV_FREE in the current implementation is that
> > it avoids page faults as well as zeroing.
> > Why did you see many page faults?
>
> I think I just misunderstood why it was still so much slower than not
> using purging at all.
>
> >> I get ~20k requests/s with jemalloc on the ebizzy benchmark with this
> >> dual core ivy bridge laptop. It jumps to ~60k requests/s with MADV_FREE
> >> IIRC, but disabling purging via MALLOC_CONF=lg_dirty_mult:-1 leads to
> >> 3.5 *million* requests/s. It has a similar impact with TCMalloc.
> >
> > When I tested MADV_FREE with ebizzy, I saw a similar result: two or three
> > times faster than MADV_DONTNEED. But it isn't free. It incurs the cost of
> > MADV_FREE itself (i.e., walking all page table entries in the range,
> > clearing the dirty bits, and flushing the TLB). Of course, it also takes
> > mmap_sem with the read-side lock.
> > If you see a great improvement when you disable purging, I guess it's
> > mainly because mmap_sem is not taken, so some threads can allocate while
> > other threads handle page faults. The reason I think so is that I saw a
> > similar result when I implemented the vrange syscall, which holds the
> > mmap_sem read-side lock only for a very short time (i.e., marking the
> > range volatile in the vma, O(1)), while MADV_FREE holds the lock while
> > enumerating all pages in the range (O(N)).
>
> It stops doing mmap after getting warmed up since it never unmaps so I
> don't think mmap_sem is a contention issue. It could just be caused by
> the cost of the system call itself and TLB flush. I found perf to be
> fairly useless in identifying where the time was being spent.
>
> It might be much more important to purge very large ranges in one go
> with MADV_FREE. It's a different direction than the current compromises
> forced by MADV_DONTNEED.
>
I tested ebizzy with a recent jemalloc in my KVM guest.
Apparently, no purging was the best (4925 records/s), while purging with
MADV_DONTNEED was the worst (1814 records/s).
However, on my machine, purging with MADV_FREE was not as bad as on yours:
4338 records/s vs. 4925 records/s.
No purging still wins, but if we consider the difference in the number of
madvise syscalls between no purging and MADV_FREE purging, MADV_FREE could
do better than it does now:
24 vs 43408 madvise calls
One thing I am wondering is why the madvise syscall count increases
when we turn on MADV_FREE compared to MADV_DONTNEED (43408 vs. 18171).
Might it be an aggressive dirty-purging rule inside jemalloc?
Anyway, my point is that the gap between MADV_FREE and no purging on my
machine is not as large as the one you described.
********
#> lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 12
On-line CPU(s) list: 0-11
Thread(s) per core: 2
Core(s) per socket: 6
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 45
Stepping: 7
CPU MHz: 1200.000
BogoMIPS: 6399.71
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 12288K
NUMA node0 CPU(s): 0-11
*****
ebizzy 0.2
(C) 2006-7 Intel Corporation
(C) 2007 Valerie Henson <val@nmt.edu>
always_mmap 0
never_mmap 0
chunks 10
prevent coalescing using permissions 0
prevent coalescing using holes 0
random_size 0
chunk_size 5242880
seconds 10
threads 24
verbose 1
linear 0
touch_pages 0
page size 4096
Allocated memory
Wrote memory
Threads starting
Threads finished
******
jemalloc git head
commit 65db63cf3f0c5dd5126a1b3786756486eaf931ba
Author: Jason Evans <je@fb.com>
Date: Wed Mar 25 18:56:55 2015 -0700
Fix in-place shrinking huge reallocation purging bugs.
******
1) LD_PRELOAD="/jemalloc/lib/libjemalloc.so.dontneed" strace -c -f ./ebizzy -s $((5<<20))
1814 records/s
real 10.00 s
user 28.18 s
sys 90.08 s
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 90.78   99.368420        5469     18171           madvise
  9.14   10.001131    10001131         1           nanosleep
  0.05    0.050037         807        62        10 futex
  0.03    0.031721         291       109           mmap
  0.00    0.004455         178        25           set_robust_list
  0.00    0.000129           5        24           clone
  0.00    0.000000           0         4           read
  0.00    0.000000           0         1           write
  0.00    0.000000           0         6           open
  0.00    0.000000           0         6           close
  0.00    0.000000           0         6           fstat
  0.00    0.000000           0        32           mprotect
  0.00    0.000000           0        35           munmap
  0.00    0.000000           0         2           brk
  0.00    0.000000           0         3           rt_sigaction
  0.00    0.000000           0         3           rt_sigprocmask
  0.00    0.000000           0         4         3 access
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         1         1 readlink
  0.00    0.000000           0         1           getrlimit
  0.00    0.000000           0         2           getrusage
  0.00    0.000000           0         1           arch_prctl
  0.00    0.000000           0         1           set_tid_address
------ ----------- ----------- --------- --------- ----------------
100.00  109.455893                 18501        14 total
2) LD_PRELOAD="/jemalloc/lib/libjemalloc.so.dontneed" MALLOC_CONF=lg_dirty_mult:-1 strace -c -f ./ebizzy -s $((5<<20))
4925 records/s
real 10.00 s
user 119.83 s
sys 0.16 s
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 82.73    0.821804       15804        52         6 futex
 15.70    0.156000      156000         1           nanosleep
  1.53    0.015186         115       132           mmap
  0.04    0.000349           4        87           munmap
  0.00    0.000000           0         4           read
  0.00    0.000000           0         1           write
  0.00    0.000000           0         6           open
  0.00    0.000000           0         6           close
  0.00    0.000000           0         6           fstat
  0.00    0.000000           0        32           mprotect
  0.00    0.000000           0         2           brk
  0.00    0.000000           0         3           rt_sigaction
  0.00    0.000000           0         3           rt_sigprocmask
  0.00    0.000000           0         4         3 access
  0.00    0.000000           0        24           madvise
  0.00    0.000000           0        24           clone
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         1         1 readlink
  0.00    0.000000           0         1           getrlimit
  0.00    0.000000           0         2           getrusage
  0.00    0.000000           0         1           arch_prctl
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0        25           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    0.993339                   419        10 total
3) LD_PRELOAD="/jemalloc/lib/libjemalloc.so.free" strace -c -f ./ebizzy -s $((5<<20))
4338 records/s
real 10.00 s
user 91.40 s
sys 12.58 s
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 78.39   36.433483         839     43408           madvise
 21.53   10.004889    10004889         1           nanosleep
  0.04    0.020472         394        52        15 futex
  0.03    0.015464         145       107           mmap
  0.00    0.000041           2        24           clone
  0.00    0.000000           0         4           read
  0.00    0.000000           0         1           write
  0.00    0.000000           0         6           open
  0.00    0.000000           0         6           close
  0.00    0.000000           0         6           fstat
  0.00    0.000000           0        32           mprotect
  0.00    0.000000           0        33           munmap
  0.00    0.000000           0         2           brk
  0.00    0.000000           0         3           rt_sigaction
  0.00    0.000000           0         3           rt_sigprocmask
  0.00    0.000000           0         4         3 access
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         1         1 readlink
  0.00    0.000000           0         1           getrlimit
  0.00    0.000000           0         2           getrusage
  0.00    0.000000           0         1           arch_prctl
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0        25           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00   46.474349                 43724        19 total
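
For readers who want to compare the three runs side by side, the arithmetic
can be sketched as below. This is only an illustration: the figures are
copied from the ebizzy and strace output in this message, the dictionary
layout and the summarize() helper are made up for the example, and the
timings were taken under strace, so the absolute per-call costs are
probably inflated by the tracing overhead.

```python
# Figures copied from the three ebizzy runs reported in this message.
# The structure and names below are illustrative, not from any tool.
runs = {
    "MADV_DONTNEED": {"records_per_s": 1814, "madvise_calls": 18171,
                      "madvise_seconds": 99.368420},
    "no purging":    {"records_per_s": 4925, "madvise_calls": 24,
                      "madvise_seconds": 0.0},
    "MADV_FREE":     {"records_per_s": 4338, "madvise_calls": 43408,
                      "madvise_seconds": 36.433483},
}


def summarize(runs, baseline="no purging"):
    """Return throughput relative to the baseline run and the average
    time per madvise call (in microseconds) for every run."""
    base = runs[baseline]["records_per_s"]
    out = {}
    for name, r in runs.items():
        calls = r["madvise_calls"]
        out[name] = {
            "relative_throughput": r["records_per_s"] / base,
            "usecs_per_madvise":
                r["madvise_seconds"] * 1e6 / calls if calls else 0.0,
        }
    return out


summary = summarize(runs)
for name, s in summary.items():
    print(f"{name:14s} {s['relative_throughput']:6.1%} of baseline, "
          f"{s['usecs_per_madvise']:8.0f} us per madvise call")
```

With these inputs, MADV_FREE reaches about 88% of the no-purging throughput
and MADV_DONTNEED about 37%, and the average madvise call is roughly 6.5
times cheaper with MADV_FREE than with MADV_DONTNEED, even though
MADV_FREE issues more than twice as many calls.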
--
Kind regards,
Minchan Kim