From: Benjamin King <benjaminking@web.de>
To: Milian Wolff <milian.wolff@kdab.com>
Cc: linux-perf-users@vger.kernel.org
Subject: Re: How to sample memory usage cheaply?
Date: Sat, 1 Apr 2017 09:41:12 +0200
Message-ID: <20170401074112.GA8989@localhost>
In-Reply-To: <2288291.HPAjuFhd8F@agathebauer>

Hi!

On Fri, Mar 31, 2017 at 02:06:28PM +0200, Milian Wolff wrote:
>> I'd love to analyze page faults a bit before continuing with the more
>> expensive tracing of malloc and friends.
>
>I suggest you try out my heaptrack tool.

I did! I also tried the heap profiler from Google perftools and a homegrown
approach using uprobes on malloc, free, mmap and munmap. My bcc-fu is too weak
to do me any good right now. The workload that I am working with takes a few
hours to generate, however, so any extra overhead adds a lot of latency before
I can observe the effect of the changes I make.

I guess that the issue I am facing currently is a bit more fundamental, since I
don't have a good way to trace/sample instances where my process is using a
page of physical memory *for the first time*. Bonus points for detecting when
the use count on behalf of my process drops to zero.

I'd like to tell these events apart from instances where I am using the same
physical page from more than one virtual page. It feels like there should be
a way to do so, but I don't know how.
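
For illustration, here is a minimal sketch of one building block: checking
which pages of a range are currently present for the process by reading
/proc/self/pagemap. The helper name count_present_pages is made up for this
mail. The entry layout (one 64-bit entry per virtual page, bit 63 = present)
is taken from the kernel's pagemap documentation. It does not catch the
first-use event itself, it only gives a snapshot.

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: count the pages in [addr, addr + len) that are
 * present in RAM for this process, according to /proc/self/pagemap.
 * Returns -1 if pagemap cannot be opened or read. */
long count_present_pages(const void *addr, size_t len)
{
    uintptr_t page = (uintptr_t)sysconf(_SC_PAGESIZE);
    int fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0)
        return -1;

    uintptr_t first = (uintptr_t)addr / page;
    uintptr_t last = ((uintptr_t)addr + len + page - 1) / page; /* exclusive */
    long present = 0;

    for (uintptr_t vpn = first; vpn < last; ++vpn) {
        uint64_t entry;
        /* One 8-byte entry per virtual page, indexed by page number. */
        if (pread(fd, &entry, sizeof entry, (off_t)(vpn * sizeof entry))
                != (ssize_t)sizeof entry) {
            close(fd);
            return -1;
        }
        if (entry >> 63) /* bit 63: page present in RAM */
            present++;
    }
    close(fd);
    return present;
}
```

Bits 0-54 of a present entry hold the page frame number, which is what would
be needed to notice two virtual pages sharing one physical page, but recent
kernels only expose it to processes with CAP_SYS_ADMIN.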

Also, the 'perf stat' output that I sent does not make much sense to me right
now. For example, when I add MAP_POPULATE to the flags for mmap, I only see
~40 minor page faults, which I do not understand at all.
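
To cross-check the perf numbers from inside the process, something like the
following could be used. It is only a sketch: the struct and function names
are invented here, and it assumes that the ru_minflt/ru_majflt counters from
getrusage(2) are good enough for a before/after comparison.

```c
#include <sys/resource.h>

/* Hypothetical snapshot of this process's page fault counters. */
struct fault_counts {
    long minor; /* faults served without I/O */
    long major; /* faults that required I/O */
};

/* Fill *out from getrusage(RUSAGE_SELF). Returns 0 on success, -1 on error. */
int read_fault_counts(struct fault_counts *out)
{
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    out->minor = ru.ru_minflt;
    out->major = ru.ru_majflt;
    return 0;
}
```

Calling it right before and right after the page-touching loop and
subtracting the two snapshots gives per-loop deltas, so faults taken during
program startup can be told apart from those caused by the mapping itself.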

>> This makes sense, but how can I measure physical memory properly, then?
>> Parsing the Pss-Rows in /proc/<pid>/smaps does work, but seems a bit clumsy.
>> Is there a better way (e.g. with callstacks) to measure physical memory
>> growth for a process?
>
>From what I understand, wouldn't you get this by tracing sbrk + mmap with call
>stacks?

No, not quite, since different threads in my process mmap the same file. This
counts them twice in terms of virtual memory but I'm interested in the load
on physical memory.


>Do note that file-backed mmaps can be shared across processes, so
>you'll have to take that into account as done for Pss.  But in my experience,
>when you want to improve a single application's memory consumption, it usually
>boils down to non-shared heap memory anyways. I.e. sbrk and anon mmaps.

True, and that's what I'm after, eventually, but the multiple mappings skew
the picture a bit right now.

To make progress, I need to figure out how to measure physical memory usage
properly.
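
For reference, the smaps parsing mentioned above boils down to roughly the
following. This is a sketch, not the exact code I use: the function name
sum_smaps_field_kb is made up, and it assumes the usual "Pss:   <n> kB" row
format of /proc/<pid>/smaps.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: sum all rows that start with `field` (for example
 * "Pss:") in an smaps-style file. Returns the total in kB, or -1 if the
 * file cannot be opened. */
long sum_smaps_field_kb(const char *path, const char *field)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    char line[4096];
    size_t flen = strlen(field);
    long total = 0;

    while (fgets(line, sizeof line, f)) {
        long kb;
        /* Match only at the start of the line, so "SwapPss:" is skipped
         * when looking for "Pss:". */
        if (strncmp(line, field, flen) == 0 &&
            sscanf(line + flen, "%ld", &kb) == 1)
            total += kb;
    }
    fclose(f);
    return total;
}
```

Calling sum_smaps_field_kb("/proc/self/smaps", "Pss:") periodically gives the
growth over time, but no callstacks, which is why it feels clumsy.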

>But if you look at the callstacks for these syscalls, they usually point at
>the obvious places (i.e. mempools), but you won't see what is _actually_ using
>the memory of these pools. Heaptrack or massif is much better in that regard.

>
>Hope that helps, happy profiling!

Thanks,
  Benjamin


>
>> PS:
>>  Here is some measurement with a leaky toy program. It uses a 1GB
>> zero-filled file and drops file system caches prior to the measurement to
>> encourage major page faults for the first mapping only. Does not work at
>> all: -----
>> $ gcc -O0 mmap_faults.c
>> $ fallocate -z -l $((1<<30)) 1gb_of_garbage.dat
>> $ sudo sysctl -w vm.drop_caches=3
>> vm.drop_caches = 3
>> $ perf stat -eminor-faults,major-faults ./a.out
>>
>>  Performance counter stats for './a.out':
>>
>>             327,726      minor-faults
>>                   1      major-faults
>> $ cat mmap_faults.c
>> #include <fcntl.h>
>> #include <sys/mman.h>
>> #include <unistd.h>
>>
>> #define numMaps 20
>> #define length (1u<<30)
>> #define path "1gb_of_garbage.dat"
>>
>> int main()
>> {
>>   int sum = 0;
>>   for ( int j = 0; j < numMaps; ++j )
>>   {
>>     const char *result =
>>       (const char*)mmap( NULL, length, PROT_READ, MAP_PRIVATE,
>>           open( path, O_RDONLY ), 0 );
>>
>>     for ( int i = 0; i < length; i += 4096 )
>>       sum += result[ i ];
>>   }
>>   return sum;
>> }
>> -----
>> Shouldn't I see ~5 Million page faults (20GB/4K)?
>> Shouldn't I see more major page faults?
>> Same thing when garbage-file is filled from /dev/urandom
>> Even weirder when MAP_POPULATE'ing the file
>
>
>-- 
>Milian Wolff | milian.wolff@kdab.com | Software Engineer
>KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
>Tel: +49-30-521325470
>KDAB - The Qt Experts
