Re: [PATCH 2/3] KVM: selftests: Collect memory access latency samples

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Ricardo Koller <ricarkol@google.com>
To: Colton Lewis <coltonlewis@google.com>
Cc: kvm@vger.kernel.org, pbonzini@redhat.com, maz@kernel.org,
	dmatlack@google.com, seanjc@google.com, bgardon@google.com,
	oupton@google.com
Subject: Re: [PATCH 2/3] KVM: selftests: Collect memory access latency samples
Date: Tue, 17 Jan 2023 12:43:35 -0800	[thread overview]
Message-ID: <Y8cIdxp5k8HivVAe@google.com> (raw)
In-Reply-To: <20221115173258.2530923-3-coltonlewis@google.com>

On Tue, Nov 15, 2022 at 05:32:57PM +0000, Colton Lewis wrote:
> Collect memory access latency measured in clock cycles.
> 
> This introduces a dependency on the timers for ARM and x86. No other
> architectures are implemented and their samples will all be 0.
> 
> Because keeping all samples is impractical due to the space required
> in some cases (pooled memory w/ 64 vcpus would be 64 GB/vcpu * 64
> vcpus * 250,000 samples/GB * 8 bytes/sample ~ 8 Gb extra memory just
> for samples), resevior sampling is used to only keep a small number of

nit: reservoir

> samples per vcpu (1000 samples in this patch).

Didn't see this before my previous comment. But, I guess it still
applies: isn't it possible to know the number of events to store?  to
avoid the "100" obtained via trial and error.

> 
> Resevoir sampling means despite keeping only a small number of
> samples, each sample has an equal chance of making it to the
> resevoir. Simple proofs of this can be found online. This makes the
> resevoir a good representation of the distribution of samples and

reservoir

> enables calculation of reasonably accurate percentiles.
> 
> All samples are stored in a statically allocated flat array for ease
> of combining them later. Samples are stored at an offset in this array
> calculated by the vcpu index (so vcpu 5 sample 10 would be stored at
> address sample_times + 5 * vcpu_idx + 10).
> 
> Signed-off-by: Colton Lewis <coltonlewis@google.com>
> ---
>  .../selftests/kvm/lib/perf_test_util.c        | 34 +++++++++++++++++--
>  1 file changed, 32 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/testing/selftests/kvm/lib/perf_test_util.c b/tools/testing/selftests/kvm/lib/perf_test_util.c
> index a48904b64e19..0311da76bae0 100644
> --- a/tools/testing/selftests/kvm/lib/perf_test_util.c
> +++ b/tools/testing/selftests/kvm/lib/perf_test_util.c
> @@ -4,6 +4,9 @@
>   */
>  #include <inttypes.h>
>  
> +#if defined(__aarch64__)
> +#include "aarch64/arch_timer.h"
> +#endif
>  #include "kvm_util.h"
>  #include "perf_test_util.h"
>  #include "processor.h"
> @@ -44,6 +47,18 @@ static struct kvm_vcpu *vcpus[KVM_MAX_VCPUS];
>  /* Store all samples in a flat array so they can be easily sorted later. */
>  uint64_t latency_samples[SAMPLE_CAPACITY];
>  
> +static uint64_t perf_test_timer_read(void)
> +{
> +#if defined(__aarch64__)
> +	return timer_get_cntct(VIRTUAL);
> +#elif defined(__x86_64__)
> +	return rdtsc();
> +#else
> +#warn __func__ " is not implemented for this architecture, will return 0"
> +	return 0;
> +#endif
> +}
> +
>  /*
>   * Continuously write to the first 8 bytes of each page in the
>   * specified region.
> @@ -59,6 +74,10 @@ void perf_test_guest_code(uint32_t vcpu_idx)
>  	int i;
>  	struct guest_random_state rand_state =
>  		new_guest_random_state(pta->random_seed + vcpu_idx);
> +	uint64_t *latency_samples_offset = latency_samples + SAMPLES_PER_VCPU * vcpu_idx;
> +	uint64_t count_before;
> +	uint64_t count_after;
> +	uint32_t maybe_sample;
>  
>  	gva = vcpu_args->gva;
>  	pages = vcpu_args->pages;
> @@ -75,10 +94,21 @@ void perf_test_guest_code(uint32_t vcpu_idx)
>  
>  			addr = gva + (page * pta->guest_page_size);
>  
> -			if (guest_random_u32(&rand_state) % 100 < pta->write_percent)
> +			if (guest_random_u32(&rand_state) % 100 < pta->write_percent) {
> +				count_before = perf_test_timer_read();
>  				*(uint64_t *)addr = 0x0123456789ABCDEF;
> -			else
> +				count_after = perf_test_timer_read();
> +			} else {
> +				count_before = perf_test_timer_read();
>  				READ_ONCE(*(uint64_t *)addr);
> +				count_after = perf_test_timer_read();

"count_before ... ACCESS count_after" could be moved to some macro,
e.g.,:
	t = MEASURE(READ_ONCE(*(uint64_t *)addr));

> +			}
> +
> +			maybe_sample = guest_random_u32(&rand_state) % (i + 1);
> +			if (i < SAMPLES_PER_VCPU)
> +				latency_samples_offset[i] = count_after - count_before;
> +			else if (maybe_sample < SAMPLES_PER_VCPU)
> +				latency_samples_offset[maybe_sample] = count_after - count_before;

I would prefer these reservoir sampling details to be in a helper, 
e.g.,:
	reservoir_sample_record(t, i);

>  		}
>  
>  		GUEST_SYNC(1);
> -- 
> 2.38.1.431.g37b22c650d-goog
>

next prev parent reply	other threads:[~2023-01-17 22:14 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-15 17:32 [PATCH 0/3] Calculate memory access latency stats Colton Lewis
2022-11-15 17:32 ` [PATCH 1/3] KVM: selftests: Allocate additional space for latency samples Colton Lewis
2023-01-17 20:32   ` Ricardo Koller
2023-01-18 16:49   ` Sean Christopherson
2023-01-26 18:00     ` Colton Lewis
2022-11-15 17:32 ` [PATCH 2/3] KVM: selftests: Collect memory access " Colton Lewis
2023-01-17 20:43   ` Ricardo Koller [this message]
2023-01-18 16:32     ` Sean Christopherson
2023-01-26 18:00       ` Colton Lewis
2023-01-26 19:07         ` Sean Christopherson
2023-01-26 17:58     ` Colton Lewis
2023-01-26 18:30       ` Ricardo Koller
2023-01-17 20:48   ` Ricardo Koller
2023-01-26 17:59     ` Colton Lewis
2022-11-15 17:32 ` [PATCH 3/3] KVM: selftests: Print summary stats of memory latency distribution Colton Lewis
2023-01-17 20:45   ` Ricardo Koller
2023-01-26 17:58     ` Colton Lewis
2023-01-18 16:43   ` Sean Christopherson
2023-01-26 17:59     ` Colton Lewis

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y8cIdxp5k8HivVAe@google.com \
    --to=ricarkol@google.com \
    --cc=bgardon@google.com \
    --cc=coltonlewis@google.com \
    --cc=dmatlack@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=oupton@google.com \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.