All of lore.kernel.org
 help / color / mirror / Atom feed
From: Oliver Upton <oupton@google.com>
To: kvmarm@lists.cs.columbia.edu
Cc: kvm@vger.kernel.org, Marc Zyngier <maz@kernel.org>,
	Ben Gardon <bgardon@google.com>, Peter Shier <pshier@google.com>,
	David Matlack <dmatlack@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [RFC PATCH 00/17] KVM: arm64: Parallelize stage 2 fault handling
Date: Sat, 16 Apr 2022 06:23:46 +0000	[thread overview]
Message-ID: <Ylpg8l0BDlNpDWjs@google.com> (raw)
In-Reply-To: <20220415215901.1737897-1-oupton@google.com>

On Fri, Apr 15, 2022 at 09:58:44PM +0000, Oliver Upton wrote:

[...]

> 
> Smoke tested with KVM selftests + kvm_page_table_test w/ 2M hugetlb to
> exercise the table pruning code. Haven't done anything beyond this,
> sending as an RFC now to get eyes on the code.

Ok, got around to testing this thing a bit harder. Keep in mind that
permission faults at PAGE_SIZE granularity already go on the read side
of the lock. I used the dirty_log_perf_test with 4G/vCPU and anonymous
THP all the way up to 48 vCPUs. Here is the data as it compares to
5.18-rc2.

Dirty log time (split 2M -> 4K):

+-------+----------+-------------------+
| vCPUs | 5.18-rc2 | 5.18-rc2 + series |
+-------+----------+-------------------+
|     1 | 0.83s    | 0.85s             |
|     2 | 0.95s    | 1.07s             |
|     4 | 2.65s    | 1.13s             |
|     8 | 4.88s    | 1.33s             |
|    16 | 9.71s    | 1.73s             |
|    32 | 20.43s   | 3.99s             |
|    48 | 29.15s   | 6.28s             |
+-------+----------+-------------------+

The scaling of prefaulting pass looks better too (same config):

+-------+----------+-------------------+
| vCPUs | 5.18-rc2 | 5.18-rc2 + series |
+-------+----------+-------------------+
|     1 | 0.42s    | 0.18s             |
|     2 | 0.55s    | 0.19s             |
|     4 | 0.79s    | 0.27s             |
|     8 | 1.29s    | 0.35s             |
|    16 | 2.03s    | 0.53s             |
|    32 | 4.03s    | 1.01s             |
|    48 | 6.10s    | 1.51s             |
+-------+----------+-------------------+

--
Thanks,
Oliver
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

WARNING: multiple messages have this Message-ID (diff)
From: Oliver Upton <oupton@google.com>
To: kvmarm@lists.cs.columbia.edu
Cc: kvm@vger.kernel.org, Marc Zyngier <maz@kernel.org>,
	James Morse <james.morse@arm.com>,
	Alexandru Elisei <alexandru.elisei@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	linux-arm-kernel@lists.infradead.org,
	Peter Shier <pshier@google.com>,
	Ricardo Koller <ricarkol@google.com>,
	Reiji Watanabe <reijiw@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	Ben Gardon <bgardon@google.com>,
	David Matlack <dmatlack@google.com>
Subject: Re: [RFC PATCH 00/17] KVM: arm64: Parallelize stage 2 fault handling
Date: Sat, 16 Apr 2022 06:23:46 +0000	[thread overview]
Message-ID: <Ylpg8l0BDlNpDWjs@google.com> (raw)
In-Reply-To: <20220415215901.1737897-1-oupton@google.com>

On Fri, Apr 15, 2022 at 09:58:44PM +0000, Oliver Upton wrote:

[...]

> 
> Smoke tested with KVM selftests + kvm_page_table_test w/ 2M hugetlb to
> exercise the table pruning code. Haven't done anything beyond this,
> sending as an RFC now to get eyes on the code.

Ok, got around to testing this thing a bit harder. Keep in mind that
permission faults at PAGE_SIZE granularity already go on the read side
of the lock. I used the dirty_log_perf_test with 4G/vCPU and anonymous
THP all the way up to 48 vCPUs. Here is the data as it compares to
5.18-rc2.

Dirty log time (split 2M -> 4K):

+-------+----------+-------------------+
| vCPUs | 5.18-rc2 | 5.18-rc2 + series |
+-------+----------+-------------------+
|     1 | 0.83s    | 0.85s             |
|     2 | 0.95s    | 1.07s             |
|     4 | 2.65s    | 1.13s             |
|     8 | 4.88s    | 1.33s             |
|    16 | 9.71s    | 1.73s             |
|    32 | 20.43s   | 3.99s             |
|    48 | 29.15s   | 6.28s             |
+-------+----------+-------------------+

The scaling of prefaulting pass looks better too (same config):

+-------+----------+-------------------+
| vCPUs | 5.18-rc2 | 5.18-rc2 + series |
+-------+----------+-------------------+
|     1 | 0.42s    | 0.18s             |
|     2 | 0.55s    | 0.19s             |
|     4 | 0.79s    | 0.27s             |
|     8 | 1.29s    | 0.35s             |
|    16 | 2.03s    | 0.53s             |
|    32 | 4.03s    | 1.01s             |
|    48 | 6.10s    | 1.51s             |
+-------+----------+-------------------+

--
Thanks,
Oliver

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

WARNING: multiple messages have this Message-ID (diff)
From: Oliver Upton <oupton@google.com>
To: kvmarm@lists.cs.columbia.edu
Cc: kvm@vger.kernel.org, Marc Zyngier <maz@kernel.org>,
	James Morse <james.morse@arm.com>,
	Alexandru Elisei <alexandru.elisei@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	linux-arm-kernel@lists.infradead.org,
	Peter Shier <pshier@google.com>,
	Ricardo Koller <ricarkol@google.com>,
	Reiji Watanabe <reijiw@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	Ben Gardon <bgardon@google.com>,
	David Matlack <dmatlack@google.com>
Subject: Re: [RFC PATCH 00/17] KVM: arm64: Parallelize stage 2 fault handling
Date: Sat, 16 Apr 2022 06:23:46 +0000	[thread overview]
Message-ID: <Ylpg8l0BDlNpDWjs@google.com> (raw)
In-Reply-To: <20220415215901.1737897-1-oupton@google.com>

On Fri, Apr 15, 2022 at 09:58:44PM +0000, Oliver Upton wrote:

[...]

> 
> Smoke tested with KVM selftests + kvm_page_table_test w/ 2M hugetlb to
> exercise the table pruning code. Haven't done anything beyond this,
> sending as an RFC now to get eyes on the code.

Ok, got around to testing this thing a bit harder. Keep in mind that
permission faults at PAGE_SIZE granularity already go on the read side
of the lock. I used the dirty_log_perf_test with 4G/vCPU and anonymous
THP all the way up to 48 vCPUs. Here is the data as it compares to
5.18-rc2.

Dirty log time (split 2M -> 4K):

+-------+----------+-------------------+
| vCPUs | 5.18-rc2 | 5.18-rc2 + series |
+-------+----------+-------------------+
|     1 | 0.83s    | 0.85s             |
|     2 | 0.95s    | 1.07s             |
|     4 | 2.65s    | 1.13s             |
|     8 | 4.88s    | 1.33s             |
|    16 | 9.71s    | 1.73s             |
|    32 | 20.43s   | 3.99s             |
|    48 | 29.15s   | 6.28s             |
+-------+----------+-------------------+

The scaling of prefaulting pass looks better too (same config):

+-------+----------+-------------------+
| vCPUs | 5.18-rc2 | 5.18-rc2 + series |
+-------+----------+-------------------+
|     1 | 0.42s    | 0.18s             |
|     2 | 0.55s    | 0.19s             |
|     4 | 0.79s    | 0.27s             |
|     8 | 1.29s    | 0.35s             |
|    16 | 2.03s    | 0.53s             |
|    32 | 4.03s    | 1.01s             |
|    48 | 6.10s    | 1.51s             |
+-------+----------+-------------------+

--
Thanks,
Oliver

  parent reply	other threads:[~2022-04-16  6:23 UTC|newest]

Thread overview: 165+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-15 21:58 [RFC PATCH 00/17] KVM: arm64: Parallelize stage 2 fault handling Oliver Upton
2022-04-15 21:58 ` Oliver Upton
2022-04-15 21:58 ` Oliver Upton
2022-04-15 21:58 ` [RFC PATCH 01/17] KVM: arm64: Directly read owner id field in stage2_pte_is_counted() Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-15 21:58 ` [RFC PATCH 02/17] KVM: arm64: Only read the pte once per visit Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-21 16:12   ` Ben Gardon
2022-04-21 16:12     ` Ben Gardon
2022-04-21 16:12     ` Ben Gardon
2022-04-15 21:58 ` [RFC PATCH 03/17] KVM: arm64: Return the next table from map callbacks Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-15 21:58 ` [RFC PATCH 04/17] KVM: arm64: Protect page table traversal with RCU Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-19  2:55   ` Ricardo Koller
2022-04-19  2:55     ` Ricardo Koller
2022-04-19  2:55     ` Ricardo Koller
2022-04-19  3:01     ` Oliver Upton
2022-04-19  3:01       ` Oliver Upton
2022-04-19  3:01       ` Oliver Upton
2022-04-15 21:58 ` [RFC PATCH 05/17] KVM: arm64: Take an argument to indicate parallel walk Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-16 11:30   ` Marc Zyngier
2022-04-16 11:30     ` Marc Zyngier
2022-04-16 11:30     ` Marc Zyngier
2022-04-16 16:03     ` Oliver Upton
2022-04-16 16:03       ` Oliver Upton
2022-04-16 16:03       ` Oliver Upton
2022-04-15 21:58 ` [RFC PATCH 06/17] KVM: arm64: Implement break-before-make sequence for parallel walks Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-20 16:55   ` Quentin Perret
2022-04-20 16:55     ` Quentin Perret
2022-04-20 16:55     ` Quentin Perret
2022-04-20 17:06     ` Oliver Upton
2022-04-20 17:06       ` Oliver Upton
2022-04-20 17:06       ` Oliver Upton
2022-04-21 16:57   ` Ben Gardon
2022-04-21 16:57     ` Ben Gardon
2022-04-21 16:57     ` Ben Gardon
2022-04-21 18:52     ` Oliver Upton
2022-04-21 18:52       ` Oliver Upton
2022-04-21 18:52       ` Oliver Upton
2022-04-26 21:32       ` Ben Gardon
2022-04-26 21:32         ` Ben Gardon
2022-04-26 21:32         ` Ben Gardon
2022-04-25 15:13   ` Sean Christopherson
2022-04-25 15:13     ` Sean Christopherson
2022-04-25 15:13     ` Sean Christopherson
2022-04-25 16:53     ` Oliver Upton
2022-04-25 16:53       ` Oliver Upton
2022-04-25 16:53       ` Oliver Upton
2022-04-25 18:16       ` Sean Christopherson
2022-04-25 18:16         ` Sean Christopherson
2022-04-25 18:16         ` Sean Christopherson
2022-04-15 21:58 ` [RFC PATCH 07/17] KVM: arm64: Enlighten perm relax path about " Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-15 21:58 ` [RFC PATCH 08/17] KVM: arm64: Spin off helper for initializing table pte Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-15 21:58 ` [RFC PATCH 09/17] KVM: arm64: Tear down unlinked page tables in parallel walk Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-21 13:21   ` Quentin Perret
2022-04-21 13:21     ` Quentin Perret
2022-04-21 13:21     ` Quentin Perret
2022-04-21 16:40     ` Oliver Upton
2022-04-21 16:40       ` Oliver Upton
2022-04-21 16:40       ` Oliver Upton
2022-04-22 16:00       ` Quentin Perret
2022-04-22 16:00         ` Quentin Perret
2022-04-22 16:00         ` Quentin Perret
2022-04-22 20:41         ` Oliver Upton
2022-04-22 20:41           ` Oliver Upton
2022-04-22 20:41           ` Oliver Upton
2022-05-03 14:17           ` Quentin Perret
2022-05-03 14:17             ` Quentin Perret
2022-05-03 14:17             ` Quentin Perret
2022-05-04  6:03             ` Oliver Upton
2022-05-04  6:03               ` Oliver Upton
2022-05-04  6:03               ` Oliver Upton
2022-04-15 21:58 ` [RFC PATCH 10/17] KVM: arm64: Assume a table pte is already owned in post-order traversal Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-21 16:11   ` Ben Gardon
2022-04-21 16:11     ` Ben Gardon
2022-04-21 16:11     ` Ben Gardon
2022-04-21 17:16     ` Oliver Upton
2022-04-21 17:16       ` Oliver Upton
2022-04-21 17:16       ` Oliver Upton
2022-04-15 21:58 ` [RFC PATCH 11/17] KVM: arm64: Move MMU cache init/destroy into helpers Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-15 21:58 ` [RFC PATCH 12/17] KVM: arm64: Stuff mmu page cache in sub struct Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-15 21:58 ` [RFC PATCH 13/17] KVM: arm64: Setup cache for stage2 page headers Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-15 21:58 ` [RFC PATCH 14/17] KVM: arm64: Punt last page reference to rcu callback for parallel walk Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-19  2:59   ` Ricardo Koller
2022-04-19  2:59     ` Ricardo Koller
2022-04-19  2:59     ` Ricardo Koller
2022-04-19  3:09     ` Ricardo Koller
2022-04-19  3:09       ` Ricardo Koller
2022-04-19  3:09       ` Ricardo Koller
2022-04-20  0:53       ` Oliver Upton
2022-04-20  0:53         ` Oliver Upton
2022-04-20  0:53         ` Oliver Upton
2022-09-08  0:52         ` David Matlack
2022-09-08  0:52           ` David Matlack
2022-09-08  0:52           ` David Matlack
2022-04-21 16:28   ` Ben Gardon
2022-04-21 16:28     ` Ben Gardon
2022-04-21 16:28     ` Ben Gardon
2022-04-15 21:58 ` [RFC PATCH 15/17] KVM: arm64: Allow parallel calls to kvm_pgtable_stage2_map() Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-15 21:58   ` Oliver Upton
2022-04-15 21:59 ` [RFC PATCH 16/17] KVM: arm64: Enable parallel stage 2 MMU faults Oliver Upton
2022-04-15 21:59   ` Oliver Upton
2022-04-15 21:59   ` Oliver Upton
2022-04-21 16:35   ` Ben Gardon
2022-04-21 16:35     ` Ben Gardon
2022-04-21 16:35     ` Ben Gardon
2022-04-21 16:46     ` Oliver Upton
2022-04-21 16:46       ` Oliver Upton
2022-04-21 16:46       ` Oliver Upton
2022-04-21 17:03       ` Ben Gardon
2022-04-21 17:03         ` Ben Gardon
2022-04-21 17:03         ` Ben Gardon
2022-04-15 21:59 ` [RFC PATCH 17/17] TESTONLY: KVM: arm64: Add super lazy accounting of stage 2 table pages Oliver Upton
2022-04-15 21:59   ` Oliver Upton
2022-04-15 21:59   ` Oliver Upton
2022-04-15 23:35 ` [RFC PATCH 00/17] KVM: arm64: Parallelize stage 2 fault handling David Matlack
2022-04-15 23:35   ` David Matlack
2022-04-15 23:35   ` David Matlack
2022-04-16  0:04   ` Oliver Upton
2022-04-16  0:04     ` Oliver Upton
2022-04-16  0:04     ` Oliver Upton
2022-04-21 16:43     ` David Matlack
2022-04-21 16:43       ` David Matlack
2022-04-21 16:43       ` David Matlack
2022-04-16  6:23 ` Oliver Upton [this message]
2022-04-16  6:23   ` Oliver Upton
2022-04-16  6:23   ` Oliver Upton
2022-04-19 17:57 ` Ben Gardon
2022-04-19 17:57   ` Ben Gardon
2022-04-19 17:57   ` Ben Gardon
2022-04-19 18:36   ` Oliver Upton
2022-04-19 18:36     ` Oliver Upton
2022-04-19 18:36     ` Oliver Upton
2022-04-21 16:30     ` Ben Gardon
2022-04-21 16:30       ` Ben Gardon
2022-04-21 16:30       ` Ben Gardon
2022-04-21 16:37       ` Paolo Bonzini
2022-04-21 16:37         ` Paolo Bonzini
2022-04-21 16:37         ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Ylpg8l0BDlNpDWjs@google.com \
    --to=oupton@google.com \
    --cc=bgardon@google.com \
    --cc=dmatlack@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=maz@kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=pshier@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.