From: Peter Xu <peterx@redhat.com>
To: Anish Moorthy <amoorthy@google.com>
Cc: Nadav Amit <nadav.amit@gmail.com>,
Axel Rasmussen <axelrasmussen@google.com>,
Paolo Bonzini <pbonzini@redhat.com>,
maz@kernel.org, oliver.upton@linux.dev,
Sean Christopherson <seanjc@google.com>,
James Houghton <jthoughton@google.com>,
bgardon@google.com, dmatlack@google.com, ricarkol@google.com,
kvm <kvm@vger.kernel.org>,
kvmarm@lists.linux.dev
Subject: Re: [PATCH v3 00/22] Improve scalability of KVM + userfaultfd live migration via annotated memory faults.
Date: Thu, 27 Apr 2023 16:26:44 -0400
Message-ID: <ZErahL/7DKimG+46@x1n>
In-Reply-To: <CAF7b7mr-_U6vU1iOwukdmOoaT0G1ttyxD62cv=vebnQeXL3R0w@mail.gmail.com>
Hi, Anish,
On Mon, Apr 24, 2023 at 05:15:49PM -0700, Anish Moorthy wrote:
> On Mon, Apr 24, 2023 at 12:44 PM Nadav Amit <nadav.amit@gmail.com> wrote:
> >
> >
> >
> > > On Apr 24, 2023, at 10:54 AM, Anish Moorthy <amoorthy@google.com> wrote:
> > >
> > > On Fri, Apr 21, 2023 at 10:40 AM Nadav Amit <nadav.amit@gmail.com> wrote:
> > >>
> > >> If I understand the problem correctly, it sounds as if the proper solution
> > >> should be some kind of range-lock. If it is too heavy or the interface can
> > >> be changed/extended to wake a single address (instead of a range),
> > >> simpler hashed-locks can be used.
> > >
> > > Some sort of range-based locking system does seem relevant, although I
> > > don't see how that would necessarily speed up the delivery of faults
> > > to UFFD readers: I'll have to think about it more.
> >
> > Perhaps I misread your issue. Based on the scalability issues you raised,
> > I assumed that the problem you encountered is related to lock contention.
> > I do not know whether you profiled it, but some information would be
> > useful.
>
> No, you had it right: the issue at hand is contention on the uffd wait
> queues. I'm just not sure what the range-based locking would really be
> doing. Events would still have to be delivered to userspace in an
> ordered manner, so it seems to me that each uffd would still need to
> maintain a queue (and the associated contention).
>
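For reference, the "hashed locks" idea above would amount to lock striping
keyed by the faulting address, so that faults on different pages rarely
contend on the same lock. A minimal userspace sketch, purely illustrative
(made-up names, not proposed kernel or selftest code):

#include <pthread.h>

#define NR_BUCKETS 64

/* One lock per hash bucket instead of one big lock: contention gets
 * spread across NR_BUCKETS buckets. */
static pthread_mutex_t fault_locks[NR_BUCKETS];

static void fault_locks_init(void)
{
        int i;

        for (i = 0; i < NR_BUCKETS; i++)
                pthread_mutex_init(&fault_locks[i], NULL);
}

static pthread_mutex_t *fault_lock_for(unsigned long addr)
{
        /* Hash on the page frame number (4K pages assumed). */
        return &fault_locks[(addr >> 12) % NR_BUCKETS];
}

As noted above, striping on its own doesn't remove the need for an ordered
per-uffd event queue, which is where the measured contention lives.
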
> With respect to the "sharding" idea, I collected some more runs of the
> self test (full command in [1]). This time I omitted the "-a" flag, so
> that every vCPU accesses a different range of guest memory with its
> own UFFD, and set the number of reader threads per UFFD to 1.
>
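As an aside, a rough sketch of what "its own UFFD" per vCPU means here:
one uffd created and registered per slice of guest memory. Illustrative
only (made-up helper name, not the selftest's actual code):

#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Create a uffd covering only this vcpu's slice of guest memory, in
 * MINOR mode (page contents already live in the shmem backing file). */
static int uffd_for_slice(void *slice_base, size_t slice_len)
{
        int uffd = syscall(__NR_userfaultfd, O_CLOEXEC);
        struct uffdio_api api = {
                .api = UFFD_API,
                .features = UFFD_FEATURE_MINOR_SHMEM,
        };
        struct uffdio_register reg = {
                .range.start = (unsigned long)slice_base,
                .range.len   = slice_len,
                .mode        = UFFDIO_REGISTER_MODE_MINOR,
        };

        ioctl(uffd, UFFDIO_API, &api);
        ioctl(uffd, UFFDIO_REGISTER, &reg);
        return uffd;
}
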
> Average paging rate per vCPU:
> 
> |-------+--------------+-------------|
> | vCPUs | w/o new caps | w/ new caps |
> |-------+--------------+-------------|
> |     1 |          180 |         307 |
> |     2 |           85 |         220 |
> |     4 |           80 |         206 |
> |     8 |           39 |         163 |
> |    16 |           18 |         104 |
> |    32 |            8 |          73 |
> |    64 |            4 |          57 |
> |   128 |            1 |          37 |
> |   256 |            1 |          16 |
> |-------+--------------+-------------|
>
> I'm reporting paging rate on a per-vcpu rather than total basis, which
> is why the numbers look so different than the ones in the cover
> letter. I'm actually not sure why the demand paging rate falls off
> with the number of vCPUs (maybe a prioritization issue on my side?),
> but even when UFFDs aren't being contended for it's clear that demand
> paging via memory fault exits is significantly faster.
>
> I'll try to get some perf traces as well: that will take a little bit
> of time though, as doing it for cycler will involve patching our VMM
> first.
>
> [1] ./demand_paging_test -b 64M -u MINOR -s shmem -v <n> -r 1 [-w]
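For concreteness, each reader thread in [1] amounts to something like the
loop below: block on its own uffd and resolve MINOR faults with
UFFDIO_CONTINUE. This is only an illustrative sketch (error handling
dropped, 4K pages assumed), not the selftest's actual code:

#include <linux/userfaultfd.h>
#include <poll.h>
#include <sys/ioctl.h>
#include <unistd.h>

#define PAGE_SIZE 4096UL

/* One reader thread per uffd, e.g. started via pthread_create(). */
static void *uffd_reader(void *arg)
{
        int uffd = *(int *)arg;
        struct pollfd pfd = { .fd = uffd, .events = POLLIN };
        struct uffd_msg msg;

        for (;;) {
                if (poll(&pfd, 1, -1) <= 0)
                        break;
                if (read(uffd, &msg, sizeof(msg)) != sizeof(msg))
                        continue;
                if (msg.event != UFFD_EVENT_PAGEFAULT)
                        continue;

                /* MINOR fault: the data is already in the shmem file,
                 * installing the mapping is all that's needed. */
                struct uffdio_continue cont = {
                        .range.start = msg.arg.pagefault.address & ~(PAGE_SIZE - 1),
                        .range.len   = PAGE_SIZE,
                };
                ioctl(uffd, UFFDIO_CONTINUE, &cont);
        }
        return NULL;
}

The scalability question is then how fast a single uffd (and its wait
queue) can feed such threads, versus the vcpu thread resolving the fault
itself on a memory-fault exit.
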
Thanks (for doing this test, and also to Nadav for all his inputs), and
sorry for the late response.
These numbers caught my eye, and I'm very curious why even 2 vCPUs can
scale that badly.
I gave it a shot on a test machine and I got something slightly different:
Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz (20 cores, 40 threads)
$ ./demand_paging_test -b 512M -u MINOR -s shmem -v N
|-------+----------+--------|
| n_thr | per-vcpu | total |
|-------+----------+--------|
| 1 | 39.5K | 39.5K |
| 2 | 33.8K | 67.6K |
| 4 | 31.8K | 127.2K |
| 8 | 30.8K | 246.1K |
| 16 | 21.9K | 351.0K |
|-------+----------+--------|
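(The total column is just per-vcpu times n_thr, modulo rounding - e.g.
16 * 21.9K ~= 350K.)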
I used larger RAM due to fewer cores. I didn't try 32+ vCPUs, to make sure
two threads don't contend for the same core/hardware thread, since I only
have 40 hardware threads there, but we can still compare with your lower half.
While testing I noticed bad numbers and another bug where NSEC_PER_SEC
wasn't being used properly, so I applied this fix before running the test:
https://lore.kernel.org/all/20230427201112.2164776-1-peterx@redhat.com/
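Illustrative only (made-up names, not the selftest's actual helper), the
place where NSEC_PER_SEC enters such a rate calculation is roughly:

#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL

/* Pages per second from a nanosecond-resolution elapsed time: divide by
 * elapsed nanoseconds, then scale back up by NSEC_PER_SEC. */
static double pages_per_sec(uint64_t pages, struct timespec start,
                            struct timespec end)
{
        int64_t ns = (end.tv_sec - start.tv_sec) * NSEC_PER_SEC
                     + (end.tv_nsec - start.tv_nsec);

        return (double)pages * NSEC_PER_SEC / (double)ns;
}
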
I think it means it still doesn't scale that well, though not so badly
either - there's no obvious 1/2 drop when using 2 vCPUs. There are still a
bunch of paths triggered in the test, so I also don't expect it to scale
fully linearly. From my numbers I just didn't see a drop as drastic as
yours. I'm not sure whether it's simply a broken test number, parameter
differences (e.g. you used only 64M per vCPU), or hardware differences.
--
Peter Xu