API/syscall to alleviate page/memory problem when quickly accessing memory?

From: Levo D <l-asm@mail9fcb1a.bolinlang.com>
To: <linux-api@vger.kernel.org>
Subject: API/syscall to alleviate page/memory problem when quickly accessing memory?
Date: Mon,  3 Apr 2023 02:31:35 +0000 (UTC)
Message-ID: <20230403023135.E0D9E1777C2@bolin>

I optimized and profiled my program to the point where it seems to spend more time in the kernel than in userspace (likely not true, but I'll explain).

Here's one run. I spawn many threads (6 at minimum, more depending on flags). As you can see, sys is more than half of the real (wall-clock) time. Is the kernel running on multiple cores simultaneously to give my program pages?

real	0m0.954s
user	0m6.442s
sys 	0m0.607s

The run below uses -test-flags, which gives me these numbers; sys is 51% of the real time.

real	0m0.733s
user	0m3.476s
sys 	0m0.378s
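
For reference, a minimal sketch of the arithmetic behind these figures. It assumes the usual meaning of the `time` output: real is wall-clock time, while user and sys are CPU time summed over all threads. The helper names are hypothetical and only illustrate the ratios.

```c
#include <stdio.h>

/* Hypothetical helper: sys time as a fraction of wall-clock (real) time.
 * This is the ratio quoted in the message (about 51% for the second run). */
double sys_over_real(double real_s, double sys_s) {
    return sys_s / real_s;
}

/* Hypothetical helper: sys time as a fraction of all CPU time (user + sys),
 * i.e. the share of CPU work that was done in the kernel. */
double sys_over_cpu(double user_s, double sys_s) {
    return sys_s / (user_s + sys_s);
}

/* Hypothetical helper: average number of cores kept busy during the run. */
double avg_busy_cores(double real_s, double user_s, double sys_s) {
    return (user_s + sys_s) / real_s;
}

/* Print all three ratios for one `time` measurement (values in seconds). */
void print_time_ratios(double real_s, double user_s, double sys_s) {
    printf("sys/real       = %.1f%%\n", 100.0 * sys_over_real(real_s, sys_s));
    printf("sys/(user+sys) = %.1f%%\n", 100.0 * sys_over_cpu(user_s, sys_s));
    printf("busy cores     = %.2f\n", avg_busy_cores(real_s, user_s, sys_s));
}
```

Calling `print_time_ratios(0.733, 3.476, 0.378)` for the second run gives a sys/real ratio of roughly 52%, but sys is only about 10% of the summed CPU time, with about 5.3 cores busy on average. Because user and sys are summed across threads, they can exceed real.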

`perf record -F 5000 ./myapp -test-flags` shows me that 61% of the samples are in my biggest function and 6% are in `clear_page_rep`. When I record cache misses using `perf record -F 5000 --call-graph=fp -e cache-misses ./myapp -test-flags`, I can see that:

clear_page_rep takes 40%
clear_huge_page takes 1.2%
My big function's self is 8%, while its total is 25.5%. The remainder is mostly asm_exc_page_fault (12%) and asm_sysvec_apic_timer_interrupt (2.7%).
That's about 56% (of all misses and waiting) in the kernel.

I believe that if I can reduce the work done in the kernel and have pages ready before I fault, I'll have fewer cache misses in my large function and could be significantly faster. I measured how long my large function takes single-threaded compared to multithreaded: multithreaded, it is between 1.5x and 2x slower. I spawn 1 thread per core (I'm testing on a Zen 2 with 6 cores and 12 threads; spawning more than 6 threads slows the program down). Each thread uses <100MB.

Is there an API I should look into? What can I do here?
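
As an illustration of what "have pages be ready before I fault" could look like, here is a minimal sketch using `mmap` with the Linux-specific `MAP_POPULATE` flag, which asks the kernel to fault in the whole mapping at allocation time. This is an assumption about one possible approach, not something taken from the thread. The helper names `alloc_prefaulted` and `free_prefaulted` are hypothetical. The kernel still has to zero every page, so the cost is moved out of the hot function rather than removed.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes MAP_ANONYMOUS / MAP_POPULATE in glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: map `len` bytes of private anonymous memory and ask
 * the kernel to populate (pre-fault) every page up front, so page faults and
 * page zeroing happen here instead of on first touch in the hot loop.
 * Returns NULL on failure. */
void *alloc_prefaulted(size_t len) {
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: release a buffer obtained from alloc_prefaulted().
 * Returns 0 on success, -1 on error (as munmap does). */
int free_prefaulted(void *p, size_t len) {
    return munmap(p, len);
}
```

A worker thread could call `alloc_prefaulted()` once for its working buffer (under 100MB here) before entering the large function, and reuse that buffer across iterations instead of returning it to the kernel. Whether this helps in practice would need to be measured with the same `perf` runs.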

Thread overview: 2+ messages
2023-04-03  2:31 Levo D [this message]
2023-04-03  6:30 ` API/syscall to alleviate page/memory problem when quickly accessing memory? Vlastimil Babka