From: Benjamin King <benjaminking@web.de>
To: Milian Wolff <milian.wolff@kdab.com>
Cc: linux-perf-users@vger.kernel.org
Subject: Re: How to sample memory usage cheaply?
Date: Sat, 1 Apr 2017 09:41:12 +0200	[thread overview]
Message-ID: <20170401074112.GA8989@localhost> (raw)
In-Reply-To: <2288291.HPAjuFhd8F@agathebauer>

Hi!

On Fri, Mar 31, 2017 at 02:06:28PM +0200, Milian Wolff wrote:
>> I'd love to analyze page faults a bit before continuing with the more
>> expensive tracing of malloc and friends.
>
>I suggest you try out my heaptrack tool.

I did! I also tried the heap profiler from Google perftools and a homegrown
approach using uprobes on malloc, free, mmap and munmap. My bcc-fu is too weak
to do me any good right now. However, the workload I am analyzing takes a few
hours to generate, so more overhead means a lot of latency before I can
observe the effect of any change I make.

I guess the issue I am facing is a bit more fundamental: I don't have a good
way to trace or sample instances where my process uses a page of physical
memory *for the first time*. Bonus points for detecting when the use count on
behalf of my process drops back to zero.

I'd like to tell these events apart from instances where I am using the same
physical page from more than one virtual page. It feels like there should be
a way to do so, but I don't know how.
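The closest thing I know of is the "present" bit in /proc/<pid>/pagemap: it
shows whether a virtual page is currently backed by physical memory, though it
doesn't catch the first-touch event itself. Rough Linux-only sketch (on recent
kernels unprivileged readers get zeroed PFNs, but the present bit is still
readable):

```python
import ctypes, mmap, os, struct

PAGE = os.sysconf("SC_PAGE_SIZE")

def page_present(vaddr, pid="self"):
    # Each pagemap entry is 8 bytes; bit 63 is "page present in RAM".
    with open("/proc/%s/pagemap" % pid, "rb") as f:
        f.seek((vaddr // PAGE) * 8)
        entry, = struct.unpack("<Q", f.read(8))
    return bool((entry >> 63) & 1)

buf = mmap.mmap(-1, PAGE)  # anonymous mapping, demand-paged
addr = ctypes.addressof(ctypes.c_char.from_buffer(buf))
print("before touch:", page_present(addr))
buf[0] = 1                 # first touch triggers a minor fault
print("after touch:", page_present(addr))
```

Polling this is of course no substitute for tracing the fault, but it makes
"backed by a physical page or not" directly observable per address.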

Also, the 'perf stat' output that I sent does not make much sense to me right
now. For example, when I add MAP_POPULATE to the flags for mmap, I only see
~40 minor page faults, which I do not understand at all.

>> This makes sense, but how can I measure physical memory properly, then?
>> Parsing the Pss-Rows in /proc/<pid>/smaps does work, but seems a bit clumsy.
>> Is there a better way (e.g. with callstacks) to measure physical memory
>> growth for a process?
>
>From what I understand, wouldn't you get this by tracing sbrk + mmap with call
>stacks?

No, not quite, since different threads in my process mmap the same file. That
counts the pages twice in terms of virtual memory, but I'm interested in the
load on physical memory.
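To illustrate the double counting (rough Linux-only sketch, relying on the
/proc/<pid>/smaps format): map the same file twice, touch every page in both
mappings, and Rss charges the shared pages to each mapping while Pss divides
each page by its map count:

```python
import mmap, os, tempfile

PAGE = os.sysconf("SC_PAGE_SIZE")
SIZE = 64 * PAGE

def file_rss_pss(path):
    """Sum Rss and Pss (in kB) over all VMAs that map the given file."""
    rss = pss = 0
    active = False
    with open("/proc/self/smaps") as f:
        for line in f:
            fields = line.split()
            if "-" in fields[0]:              # VMA header line, ends with path
                active = line.rstrip().endswith(path)
            elif active and fields[0] == "Rss:":
                rss += int(fields[1])
            elif active and fields[0] == "Pss:":
                pss += int(fields[1])
    return rss, pss

with tempfile.NamedTemporaryFile() as tf:
    tf.truncate(SIZE)
    # two mappings of the same file, like two threads mapping shared data
    m1 = mmap.mmap(tf.fileno(), SIZE, prot=mmap.PROT_READ)
    m2 = mmap.mmap(tf.fileno(), SIZE, prot=mmap.PROT_READ)
    for off in range(0, SIZE, PAGE):          # fault in every page, twice
        m1[off]; m2[off]
    rss, pss = file_rss_pss(tf.name)
    print("Rss:", rss, "kB  Pss:", pss, "kB") # Rss roughly 2x Pss here
    m1.close(); m2.close()
```

The same physical pages back both mappings, so Pss is the number that tracks
actual load on physical memory.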


> Do note that file-backed mmaps can be shared across processes, so
>you'll have to take that into account as done for Pss.  But in my experience,
>when you want to improve a single application's memory consumption, it usually
>boils down to non-shared heap memory anyways. I.e. sbrk and anon mmaps.

True, and that's what I'm after, eventually, but the multiple mappings skew
the picture a bit right now.

To make progress, I first need to figure out how to measure physical memory
usage properly.
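For the record, the "clumsy" smaps parsing boils down to something like this
(sketch; Pss rows are in kB, and smaps needs CONFIG_PROC_PAGE_MONITOR):

```python
def pss_kb(pid="self"):
    """Proportional set size of a process in kB, summed over /proc/<pid>/smaps."""
    total = 0
    with open("/proc/%s/smaps" % pid) as f:
        for line in f:
            if line.startswith("Pss:"):
                total += int(line.split()[1])
    return total

print(pss_kb(), "kB")
```

It works, but sampling it gives no callstacks, which is what I'm missing.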

>But if you look at the callstacks for these syscalls, they usually point at
>the obvious places (i.e. mempools), but you won't see what is _actually_ using
>the memory of these pools. Heaptrack or massif is much better in that regard.

>
>Hope that helps, happy profiling!

Thanks,
  Benjamin


>
>> PS:
>>  Here is some measurement with a leaky toy program. It uses a 1GB
>> zero-filled file and drops file system caches prior to the measurement to
>> encourage major page faults for the first mapping only. Does not work at
>> all: -----
>> $ gcc -O0 mmap_faults.c
>> $ fallocate -z -l $((1<<30)) 1gb_of_garbage.dat
>> $ sudo sysctl -w vm.drop_caches=3
>> vm.drop_caches = 3
>> $ perf stat -eminor-faults,major-faults ./a.out
>>
>>  Performance counter stats for './a.out':
>>
>>             327,726      minor-faults
>>                   1      major-faults
>> $ cat mmap_faults.c
>> #include <fcntl.h>
>> #include <sys/mman.h>
>> #include <unistd.h>
>>
>> #define numMaps 20
>> #define length (1u<<30)
>> #define path "1gb_of_garbage.dat"
>>
>> int main()
>> {
>>   int sum = 0;
>>   for ( int j = 0; j < numMaps; ++j )
>>   {
>>     const char *result =
>>       (const char*)mmap( NULL, length, PROT_READ, MAP_PRIVATE,
>>           open( path, O_RDONLY ), 0 );
>>
>>     for ( int i = 0; i < length; i += 4096 )
>>       sum += result[ i ];
>>   }
>>   return sum;
>> }
>> -----
>> Shouldn't I see ~5 Million page faults (20GB/4K)?
>> Shouldn't I see more major page faults?
>> Same thing when garbage-file is filled from /dev/urandom
>> Even weirder when MAP_POPULATE'ing the file
>
>
>-- 
>Milian Wolff | milian.wolff@kdab.com | Software Engineer
>KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
>Tel: +49-30-521325470
>KDAB - The Qt Experts

Thread overview: 6+ messages
2017-03-30 20:04 How to sample memory usage cheaply? Benjamin King
2017-03-31 12:06 ` Milian Wolff
2017-04-01  7:41   ` Benjamin King [this message]
2017-04-01 13:54     ` Vince Weaver
2017-04-01 16:27       ` Benjamin King
2017-04-03 19:09 ` Benjamin King
