qemu-devel.nongnu.org archive mirror
From: Sanjay Rao <srao@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>,
	qemu-devel@nongnu.org,  Boaz Ben Shabat <bbenshab@redhat.com>,
	Joe Mario <jmario@redhat.com>
Subject: Re: [PATCH] coroutine: cap per-thread local pool size
Date: Tue, 19 Mar 2024 10:23:48 -0400	[thread overview]
Message-ID: <CACt6rRD0AMVOh5TKLy+g8wGv_r_egpkbBZE3SO6F0EhskevCqw@mail.gmail.com> (raw)
In-Reply-To: <ZfmT1s8hcW48KIn1@redhat.com>


On Tue, Mar 19, 2024 at 9:32 AM Kevin Wolf <kwolf@redhat.com> wrote:

> On 18.03.2024 at 19:34, Stefan Hajnoczi wrote:
> > The coroutine pool implementation can hit the Linux vm.max_map_count
> > limit, causing QEMU to abort with "failed to allocate memory for stack"
> > or "failed to set up stack guard page" during coroutine creation.
> >
> > This happens because per-thread pools can grow to tens of thousands of
> > coroutines. Each coroutine causes 2 virtual memory areas to be created.
> > Eventually vm.max_map_count is reached and memory-related syscalls fail.
> > The per-thread pool sizes are non-uniform and depend on past coroutine
> > usage in each thread, so it's possible for one thread to have a large
> > pool while another thread's pool is empty.
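
A minimal sketch of the two-VMA cost described above, assuming the usual
mmap()-plus-guard-page stack layout (illustrative only, not QEMU's actual
allocator; the 1 MiB stack size and the pause-to-inspect main() are
assumptions for demonstration):

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /* Allocate a coroutine-style stack: one mmap() VMA, then a
     * PROT_NONE guard page at the bottom. The differing permissions
     * split the mapping into 2 VMAs, so N pooled coroutines consume
     * 2*N entries against vm.max_map_count (Linux default 65530). */
    static void *alloc_stack(size_t size)
    {
        size_t page = sysconf(_SC_PAGESIZE);
        void *stack = mmap(NULL, size + page, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (stack == MAP_FAILED) {
            perror("mmap");      /* "failed to allocate memory for stack" */
            exit(EXIT_FAILURE);
        }
        if (mprotect(stack, page, PROT_NONE) != 0) {
            perror("mprotect");  /* "failed to set up stack guard page" */
            exit(EXIT_FAILURE);
        }
        return stack;
    }

    int main(void)
    {
        alloc_stack(1024 * 1024);  /* hypothetical 1 MiB stack */
        printf("check /proc/%d/maps\n", (int)getpid());
        getchar();                 /* pause to inspect the two VMAs */
        return 0;
    }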
> >
> > Switch to a new coroutine pool implementation with a global pool that
> > grows to a maximum number of coroutines and per-thread local pools that
> > are capped at a hardcoded small number of coroutines.
> >
> > This approach does not leave large numbers of coroutines pooled in a
> > thread that may not use them again. In order to perform well it
> > amortizes the cost of global pool accesses by working in batches of
> > coroutines instead of individual coroutines.
> >
> > The global pool is a list. Threads donate batches of coroutines to it
> > when they have too many and take batches from it when they have too few:
> >
> > .-----------------------------------.
> > | Batch 1 | Batch 2 | Batch 3 | ... | global_pool
> > `-----------------------------------'
> >
> > Each thread has up to 2 batches of coroutines:
> >
> > .-------------------.
> > | Batch 1 | Batch 2 | per-thread local_pool (maximum 2 batches)
> > `-------------------'
> >
> > The goal of this change is to reduce the excessive number of pooled
> > coroutines that cause QEMU to abort when vm.max_map_count is reached
> > without losing the performance of an adequately sized coroutine pool.
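
A hedged sketch of the two-level pool described above (not the patch's
actual code; the batch size, the omitted global cap, and the locking
details are assumptions). What it illustrates is the amortization: the
global lock is taken once per batch rather than once per coroutine:

    #include <pthread.h>
    #include <stdlib.h>

    #define BATCH_SIZE        128  /* assumed; the real value may differ */
    #define LOCAL_MAX_BATCHES   2  /* per-thread cap, as described above */

    typedef struct Coroutine Coroutine;   /* opaque */

    typedef struct Batch {
        struct Batch *next;
        unsigned n;                       /* coroutines held in this batch */
        Coroutine *co[BATCH_SIZE];
    } Batch;

    /* Global pool: a mutex-protected list of full batches. */
    static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;
    static Batch *global_pool;

    /* Per-thread local pool: at most LOCAL_MAX_BATCHES batches. */
    static __thread Batch *local_pool;
    static __thread unsigned local_batches;

    static Coroutine *pool_get(void)
    {
        /* Discard empty batches at the head of the local list. */
        while (local_pool && local_pool->n == 0) {
            Batch *empty = local_pool;
            local_pool = empty->next;
            local_batches--;
            free(empty);
        }
        if (!local_pool) {
            /* Local pool dry: take one whole batch under the lock,
             * amortizing the lock over BATCH_SIZE coroutines. */
            pthread_mutex_lock(&global_lock);
            Batch *b = global_pool;
            if (b) {
                global_pool = b->next;
            }
            pthread_mutex_unlock(&global_lock);
            if (!b) {
                return NULL;  /* both pools empty: allocate a fresh one */
            }
            b->next = NULL;
            local_pool = b;
            local_batches = 1;
        }
        return local_pool->co[--local_pool->n];
    }

    static void pool_put(Coroutine *co)
    {
        if (!local_pool || local_pool->n == BATCH_SIZE) {
            if (local_batches == LOCAL_MAX_BATCHES) {
                /* Cap reached: donate the full head batch globally.
                 * (A global size cap is omitted here for brevity.) */
                Batch *full = local_pool;
                local_pool = full->next;
                local_batches--;
                pthread_mutex_lock(&global_lock);
                full->next = global_pool;
                global_pool = full;
                pthread_mutex_unlock(&global_lock);
            }
            Batch *fresh = calloc(1, sizeof(*fresh)); /* NULL check omitted */
            fresh->next = local_pool;
            local_pool = fresh;
            local_batches++;
        }
        local_pool->co[local_pool->n++] = co;
    }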
> >
> > Here are virtio-blk disk I/O benchmark results:
> >
> >       RW BLKSIZE IODEPTH    OLD    NEW CHANGE
> > randread      4k       1 113725 117451 +3.3%
> > randread      4k       8 192968 198510 +2.9%
> > randread      4k      16 207138 209429 +1.1%
> > randread      4k      32 212399 215145 +1.3%
> > randread      4k      64 218319 221277 +1.4%
> > randread    128k       1  17587  17535 -0.3%
> > randread    128k       8  17614  17616 +0.0%
> > randread    128k      16  17608  17609 +0.0%
> > randread    128k      32  17552  17553 +0.0%
> > randread    128k      64  17484  17484 +0.0%
> >
> > See files/{fio.sh,test.xml.j2} for the benchmark configuration:
> >
> > https://gitlab.com/stefanha/virt-playbooks/-/tree/coroutine-pool-fix-sizing
> >
> > Buglink: https://issues.redhat.com/browse/RHEL-28947
> > Reported-by: Sanjay Rao <srao@redhat.com>
> > Reported-by: Boaz Ben Shabat <bbenshab@redhat.com>
> > Reported-by: Joe Mario <jmario@redhat.com>
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
>
> Reviewed-by: Kevin Wolf <kwolf@redhat.com>
>
> Though I do wonder if we can do something about the slight performance
> degradation that Sanjay reported. We seem to stay well under the hard
> limit, so the reduced global pool size shouldn't be the issue. Maybe
> it's the locking?
>
We are only seeing a slight fall-off from our much-improved numbers with
the addition of iothreads, so I am not very concerned. With database
workloads there is always some run-to-run variation, especially when there
are a lot of idle CPUs on the host. To reduce the run-to-run variation, we
use CPU/NUMA pinning and other methods like PCI passthrough. If I get a
chance, I will do some runs with CPU pinning to see what the numbers look
like.
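
For reference, the primitive underneath CPU pinning is the
sched_setaffinity(2) syscall; in practice it is driven with taskset(1),
numactl(1), or libvirt's <vcpupin> rather than with custom code. A
minimal sketch (the CPU number here is a made-up example):

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(2, &set);  /* hypothetical: restrict to host CPU 2 */
        if (sched_setaffinity(0, sizeof(set), &set) != 0) {
            perror("sched_setaffinity");
            return 1;
        }
        /* The process now runs only on CPU 2, removing one source of
         * run-to-run variation (migration across idle host CPUs). */
        printf("pinned\n");
        return 0;
    }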



> Either way, even though it could be called a fix, I don't think this is
> for 9.0, right?
>
> Kevin

-- 

Sanjay Rao
Sr. Principal Performance Engineer               Phone: 978-392-2479
Red Hat, Inc.                                    FAX:   978-392-1001
314 Littleton Road                               Email: srao@redhat.com
Westford, MA 01886



Thread overview: 15+ messages
2024-03-18 18:34 [PATCH] coroutine: cap per-thread local pool size Stefan Hajnoczi
2024-03-19 13:32 ` Kevin Wolf
2024-03-19 13:45   ` Stefan Hajnoczi
2024-03-19 14:23   ` Sanjay Rao [this message]
2024-03-19 13:43 ` Daniel P. Berrangé
2024-03-19 16:54   ` Kevin Wolf
2024-03-19 17:10     ` Daniel P. Berrangé
2024-03-19 17:41       ` Kevin Wolf
2024-03-19 20:14         ` Daniel P. Berrangé
2024-03-19 17:55   ` Stefan Hajnoczi
2024-03-19 20:10     ` Daniel P. Berrangé
2024-03-20 13:35       ` Stefan Hajnoczi
2024-03-20 14:09         ` Daniel P. Berrangé
2024-03-21 12:21           ` Kevin Wolf
2024-03-21 16:59             ` Stefan Hajnoczi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CACt6rRD0AMVOh5TKLy+g8wGv_r_egpkbBZE3SO6F0EhskevCqw@mail.gmail.com \
    --to=srao@redhat.com \
    --cc=bbenshab@redhat.com \
    --cc=jmario@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line
before the message body.