Re: [PATCH RFC 0/2] kvm: Improving undercommit,overcommit scenarios in PLE handler

From: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
To: Andrew Jones <drjones@redhat.com>
Cc: dlaor@redhat.com, Chegu Vinod <chegu_vinod@hp.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	Ingo Molnar <mingo@redhat.com>, Avi Kivity <avi@redhat.com>,
	Rik van Riel <riel@redhat.com>,
	Srikar <srikar@linux.vnet.ibm.com>,
	"Nikunj A. Dadhania" <nikunj@linux.vnet.ibm.com>,
	KVM <kvm@vger.kernel.org>, Jiannan Ouyang <ouyang@cs.pitt.edu>,
	"Andrew M. Theurer" <habanero@linux.vnet.ibm.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Srivatsa Vaddagiri <srivatsa.vaddagiri@gmail.com>,
	Gleb Natapov <gleb@redhat.com>
Subject: Re: [PATCH RFC 0/2] kvm: Improving undercommit,overcommit scenarios in PLE handler
Date: Thu, 27 Sep 2012 17:01:30 +0530
Message-ID: <50643912.8090408@linux.vnet.ibm.com>
In-Reply-To: <20120927102832.GA4106@turtle.usersys.redhat.com>

On 09/27/2012 03:58 PM, Andrew Jones wrote:
> On Thu, Sep 27, 2012 at 03:19:45PM +0530, Raghavendra K T wrote:
>> On 09/25/2012 08:30 PM, Dor Laor wrote:
>>> On 09/24/2012 02:02 PM, Raghavendra K T wrote:
>>>> On 09/24/2012 02:12 PM, Dor Laor wrote:
>>>>> In order to help PLE and pvticketlock converge I thought that a small
>>>>> test code should be developed to test this in a predictable,
>>>>> deterministic way.
>>>>>
>>>>> The idea is to have a guest kernel module that spawns a new thread
>>>>> each time you write to a /sys/.... entry.
>>>>>
>>>>> Each such thread spins over a spin lock. The specific spin lock is
>>>>> also chosen by the /sys/ interface. Let's say we have an array of spin
>>>>> locks, 10 times the number of vcpus.
>>>>>
>>>>> All the threads run the following loop:
>>>>> while (1) {
>>>>> 	spin_lock(my_lock);
>>>>> 	sum += execute_dummy_cpu_computation(time);
>>>>> 	spin_unlock(my_lock);
>>>>>
>>>>> 	if (sys_tells_thread_to_die())
>>>>> 		break;
>>>>> }
>>>>>
>>>>> print_result(sum);
>>>>>
>>>>> Instead of calling the kernel's spin_lock functions, clone them and make
>>>>> the ticket lock order deterministic and known (like a linear walk of all
>>>>> the threads trying to catch that lock).
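
For illustration only, a minimal user-space sketch of what such a cloned
ticket lock could look like, assuming C11 atomics. It is not the kernel's
spinlock code, and every name in it (struct ticket_lock, ticket_lock_take,
and so on) is hypothetical. Taking the ticket is split from waiting on it,
so a test harness can hand out tickets in whatever order it wants and
record them.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical user-space clone of a ticket lock (not the kernel's code). */
struct ticket_lock {
    atomic_uint next;   /* next ticket to hand out */
    atomic_uint owner;  /* ticket that currently owns the lock */
};

static inline void ticket_lock_init(struct ticket_lock *l)
{
    atomic_init(&l->next, 0);
    atomic_init(&l->owner, 0);
}

/* Take a ticket and return it, so the caller can log the ordering. */
static inline unsigned ticket_lock_take(struct ticket_lock *l)
{
    return atomic_fetch_add(&l->next, 1);
}

/* Non-blocking check: is it this ticket's turn to hold the lock? */
static inline int ticket_lock_is_my_turn(struct ticket_lock *l, unsigned ticket)
{
    return atomic_load(&l->owner) == ticket;
}

/* Busy-wait until the given ticket owns the lock. */
static inline void ticket_lock_wait(struct ticket_lock *l, unsigned ticket)
{
    while (!ticket_lock_is_my_turn(l, ticket))
        ; /* spin */
}

/* Release the lock by passing ownership to the next ticket. */
static inline void ticket_unlock(struct ticket_lock *l)
{
    /* Precondition: some ticket is outstanding, i.e. the lock is held. */
    assert(atomic_load(&l->owner) != atomic_load(&l->next));
    atomic_fetch_add(&l->owner, 1);
}
```

With N threads, calling ticket_lock_take() for thread 1, 2, ..., N in that
order before any of them starts spinning would give the linear walk
described above.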
>>>>
>>>> By cloning, do you mean a hierarchy of the locks?
>>>
>>> No, I meant to clone the implementation of the current spin lock code in
>>> order to set any order you may like for the ticket selection.
>>> (even for a non pvticket lock version)
>>>
>>> For instance, let's say you have N threads trying to grab the lock, you
>>> can always make the ticket go linearly from 1->2...->N.
>>> Not sure it's a good idea, just a recommendation.
>>>
>>>> Also I believe time should be passed via sysfs / hardcoded for each
>>>> type of lock we are mimicking
>>>
>>> Yap
>>>
>>>>
>>>>>
>>>>> This way you can easily calculate:
>>>>> 1. the score of a single vcpu running a single thread
>>>>> 2. the sum of all thread scores when #threads == #vcpus, all
>>>>> taking the same spin lock. The overall sum should be as close as
>>>>> possible to #1.
>>>>> 3. Like #2 but #threads > #vcpus and other versions of #total vcpus
>>>>> (belonging to all VMs) > #pcpus.
>>>>> 4. Create #threads == #vcpus but let each thread have its own spin
>>>>> lock
>>>>> 5. Like 4 + 2
>>>>>
>>>>> Hopefully this will allow you to judge and evaluate the exact
>>>>> overhead of scheduling VMs and threads since you have the ideal result
>>>>> in hand and you know what the threads are doing.
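
As a rough sketch of how items 1 and 2 above could be compared, assuming
each thread reports its score as a plain number: sum the per-thread scores
and divide by the single-thread baseline. The helper name
contention_efficiency is made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Ratio of the summed per-thread scores to the single-thread baseline.
 * A value of 1.0 means the contended run did as much total work as one
 * uncontended thread; lower values indicate scheduling/locking overhead.
 */
static inline double contention_efficiency(const double *scores, size_t n,
                                           double baseline)
{
    assert(baseline > 0.0);

    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += scores[i];
    return sum / baseline;
}
```

For example, four threads scoring 25 each against a baseline of 100 give a
ratio of 1.0, the ideal case for item 2.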
>>>>>
>>>>> My 2 cents, Dor
>>>>>
>>>>
>>>> Thank you,
>>>> I think this is an excellent idea (though I am still trying to put
>>>> together all the pieces you mentioned). So overall we should be able
>>>> to measure the performance of pvspinlock/PLE improvements with a
>>>> deterministic load in the guest.
>>>>
>>>> The only thing I am missing is
>>>> how to generate different combinations of the lock.
>>>>
>>>> Okay, let me see if I can come up with a solid model for this.
>>>>
>>>
>>> Do you mean the various options for PLE/pvticket/other? I haven't
>>> thought of it and assumed it's static, but it can also be controlled
>>> through the temporary /sys interface.
>>>
>>
>> No, I am not there yet.
>>
>> So, in summary, we are suffering from inconsistent benchmark results
>> while measuring the benefit of our improvements in PLE/pvlock, etc.
>
> Are you measuring the combined throughput of all running guests, or
> just looking at the results of the benchmarks in a single test guest?
>
> I've done some benchmarking as well and my stddevs look pretty good for
> kcbench, ebizzy, dbench, and sysbench-memory. I do 5 runs for each
> overcommit level (1.0 - 3.0, stepped by .25 or .5), and 2 runs of that
> full sequence of tests (one with the overcommit levels in scrambled
> order). The relative stddevs for each of the sets of 5 runs look pretty
> good, and the data for the 2 runs match nicely as well.
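
A small sketch of one way to check a set of 5 runs for consistency,
assuming the results are positive scores. The function name
runs_are_consistent and the threshold passed to it are placeholders, not
values taken from this thread. It compares the sample variance against
(max_rel * mean)^2, so no sqrt() call is needed.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return 1 if the relative standard deviation (sample stddev / mean) of
 * the n results in x is at most max_rel, otherwise 0.
 */
static inline int runs_are_consistent(const double *x, size_t n,
                                      double max_rel)
{
    assert(n >= 2 && max_rel >= 0.0);

    double mean = 0.0;
    for (size_t i = 0; i < n; i++)
        mean += x[i];
    mean /= (double)n;
    assert(mean > 0.0);  /* assumption: benchmark scores are positive */

    double sq = 0.0;
    for (size_t i = 0; i < n; i++)
        sq += (x[i] - mean) * (x[i] - mean);
    double variance = sq / (double)(n - 1);  /* sample variance */

    /* stddev / mean <= max_rel  <=>  variance <= (max_rel * mean)^2 */
    double limit = max_rel * mean;
    return variance <= limit * limit;
}
```

Running it once per overcommit level over the 5 results would flag the
levels whose spread is too wide to compare.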
>
> To try and get consistent results I do the following
> - interleave the memory of all guests across all numa nodes on the
>    machine
> - echo 0 > /proc/sys/kernel/randomize_va_space on both host and test
>    guest

I was not doing this.

> - echo 3 > /proc/sys/vm/drop_caches on both host and test guest before
>    each run

I was already doing this, as you know.

> - use a ramdisk for the benchmark output files on all running guests

Yes, this is also helpful.

> - no periodically running services installed on the test guest
> - HT is turned off as you do, although I'd like to try running again
>    with it turned back on
> Although, I still need to run again measuring the combined throughput
> of all running vms (including the ones launched just to generate busy
> vcpus). Maybe my results won't be as consistent then...

Maybe. I take the average across all the VMs.

>
> Drew
>
>>
>> So the good points of your suggestion are:
>> - It gives predictability to the workload that runs in the guest, so
>> that we have a pi-pi comparison of improvement.
>>
>> - We can easily tune the workload via sysfs, and we can have a script
>> to automate it.
>>
>> What is complicated is:
>> - How can we simulate a workload close to what we measure with
>> benchmarks?
>> - How can we mimic lock holding times and lock hierarchy close to the
>> way they are seen with real workloads (e.g. a highly contended zone lru
>> lock with similar lock holding times)?
>> - How close would it be if we leave out other types of spinning
>> (e.g. flush_tlb)?
>>
>> So I feel it is not as trivial as it looks.
>>
