From: "Lukáš Doktor" <ldoktor@redhat.com>
To: dlaor@redhat.com
Cc: Jiri Zupka <jzupka@redhat.com>,
	autotest <autotest@test.kernel.org>, kvm <kvm@vger.kernel.org>
Subject: Re: [Autotest] [KVM-AUTOTEST] KSM-overcommit test v.2 (python version)
Date: Tue, 01 Dec 2009 13:23:23 +0100	[thread overview]
Message-ID: <4B150ABB.5030600@redhat.com> (raw)
In-Reply-To: <4B129E8A.5070402@redhat.com>

Dne 29.11.2009 17:17, Dor Laor napsal(a):
> On 11/26/2009 12:11 PM, Lukáš Doktor wrote:
>> Hello Dor,
>>
>> Thank you for your review. I have a few questions about your comments:
>>
>> ----------- snip -----------
>>>> +        stat += "Guests memsh = {"
>>>> +        for vm in lvms:
>>>> +            if vm.is_dead():
>>>> +                logging.info("Trying to get information from a dead VM: %s"
>>>> +                             % vm.name)
>>>> +                continue
>>>
>>> You can fail the entire test. Afterwards it will be hard to find the
>>> issue.
>>>
>>
>> Well if it's what the community wants, we can change it. We just didn't
>> want to lose information about the rest of the systems. Perhaps we can
>> set some DIE flag and after collecting all statistics raise an Error.
>
> I don't think we need to continue testing when something as basic as a
> VM has died on us.
OK, we are going to change this.

>
>>
>> ----------- snip -----------
>>>> +    def get_true_pid(vm):
>>>> +        pid = vm.process.get_pid()
>>>> +        for i in range(1, 10):
>>>> +            pid = pid + 1
>>>
>>> What are you trying to do here? It seems like a nasty hack that might
>>> fail under load.
>>>
>
>
> qemu has a -pidfile option. It works fine.
>
Oh my, I hadn't thought of this. Of course I'm going to use -pidfile 
instead of this silly thing...
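For reference, the -pidfile approach boils down to something like this (a minimal sketch; the name `pidfile_path` is hypothetical and stands for whatever path the test passed on the qemu command line, not the actual kvm-autotest API):

```python
# Sketch of reading the PID that qemu writes when started with
# "qemu ... -pidfile <path>". The path argument is an assumption here.
def get_qemu_pid(pidfile_path):
    """Return the qemu process PID written into the -pidfile file."""
    with open(pidfile_path) as pidfile:
        return int(pidfile.read().strip())
```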

>>
>> Yes, and I'm really sorry for this ugly hack. The qemu command has
>> changed since the first patch was made. Nowadays "vm.pid" returns the
>> PID of the command itself, not of the actual qemu process.
>> We need the PID of the actual qemu process, which is executed by the
>> command with PID "vm.pid". That's why I first try to find the qemu
>> process at the PID following "vm.pid". I haven't found another solution
>> yet (in case we don't want to change the qemu command back in the
>> framework).
>> We have tested this solution under heavy process load and either the
>> first or the second part always finds the right value.
>>
>> ----------- snip -----------
>>>> +    if params['ksm_test_size'] == "paralel":
>>>> +        vmsc = 1
>>>> +        overcommit = 1
>>>> +        mem = host_mem
>>>> +        # 32bit system adjustment
>>>> +        if not params['image_name'].endswith("64"):
>>>> +            logging.debug("Probably i386 guest architecture, "
>>>> +                          "max allocator mem = 2G")
>>>
>>> Better not to rely on the guest name. You can test a percentage of the
>>> guest mem.
>>>
>>
>> What do you mean by "percentage of the guest mem"? This adjustment is
>> made because the maximum memory for one process on a 32 bit OS is 2GB.
>> Testing 'image_name' proved to be the most reliable method we found.
>
>
> It's not that important, but it should be a convention of kvm autotest.
> If that's acceptable, fine; otherwise each VM will have to define it in
> the config file.
>
Yes, kvm-autotest definitely needs a way to decide whether a guest is 32 
or 64 bit. I'll send a separate email to the KVM-autotest mailing list to 
let others express their opinions.

>>
>> ----------- snip -----------
>>>> +    # Guest can have more than 2G, but kvm mem + 1MB (allocator itself)
>>>> +    # can't
>>>> +    if host_mem > 2048:
>>>> +        mem = 2047
>>>> +
>>>> +
>>>> +    if os.popen("uname -i").readline().startswith("i386"):
>>>> +        logging.debug("Host is i386 architecture, max guest mem is 2G")
>>>
>>> There are bigger 32 bit guests.
>>>
>> What do you mean by this note? We are testing whether the host machine
>> is 32 bit. If so, the maximum process allocation is 2GB (similar to the
>> 32 bit guest case), but this time the whole qemu process (2GB qemu
>> machine + 64 MB qemu overhead) can't exceed 2GB.
>> Still, the maximum memory used in the test is the same (as we increase
>> the VM count - host_mem = guest_mem * vm_count; guest_mem is decreased,
>> vm_count is increased).
>
> i386 guests in PAE mode (4 additional address bits) can have up to 16G
> of RAM in theory.
>
OK, so we should first check whether PAE is on and separate into 3 groups 
(64bit-unlimited, PAE-16G, 32bit-2G).
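That three-way split could be sketched roughly like this, assuming the "flags" line of /proc/cpuinfo is a good-enough proxy (the "lm" and "pae" flag names come from the kernel's x86 cpuinfo output; the function name is hypothetical):

```python
def addr_mode(cpuinfo="/proc/cpuinfo"):
    """Classify the host as 'x86_64', 'pae' or 'i386' from the CPU flags.

    A sketch only: reads the first "flags" line of /proc/cpuinfo and
    checks for the "lm" (long mode) and "pae" flags.
    """
    flags = set()
    with open(cpuinfo) as cpu_info:
        for line in cpu_info:
            if line.startswith("flags"):
                flags = set(line.split(":", 1)[1].split())
                break
    if "lm" in flags:
        return "x86_64"   # 64bit-unlimited
    if "pae" in flags:
        return "pae"      # up to 16G in theory
    return "i386"         # plain 32 bit, 2G per process
```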

>>
>> ----------- snip -----------
>>>> +
>>>> + # Copy the allocator.c into guests
>>>
>>> .py
>>>
>> yes indeed.
>>
>> ----------- snip -----------
>>>> +    # Let ksmd work (until shared mem reaches the expected value)
>>>> +    shm = 0
>>>> +    i = 0
>>>> +    cmd = "cat /proc/%d/statm" % get_true_pid(vm)
>>>> +    while shm < ksm_size:
>>>> +        if i > 64:
>>>> +            logging.info(get_stat(lvms))
>>>> +            raise error.TestError("SHM didn't merge the memory before "
>>>> +                                  "the deadline on guest: %s" % vm.name)
>>>> +        logging.debug("Sleep(%d)" % (ksm_size / 200 * perf_ratio))
>>>> +        time.sleep(ksm_size / 200 * perf_ratio)
>>>> +        try:
>>>> +            shm = int(os.popen(cmd).readline().split()[2])
>>>> +            shm = shm * 4 / 1024
>>>> +            i = i + 1
>>>
>>> Either you have a nice statistic calculation function or you don't.
>>> I vote for the first case.
>>>
>>
>> Yes, we are using the statistics function for the output. But in this
>> case we just need to know the shm value, not to log anything.
>> If this is a big problem for others as well, we can split the statistics
>> function in two:
>> int = _get_stat(vm) - returns the shm value
>> string = get_stat(vm) - uses _get_stat and creates a nice log output
>>
>> ----------- snip -----------
>>>> +        """ Check if memory in the max-loaded guest is all right """
>>>> +        logging.info("Starting phase 3b")
>>>> +
>>>> +        """ Kill the rest of the machines """
>>>
>>> We should have a function for this in all of kvm autotest
>>>
>>
>> You mean lsessions[i].close() instead of (status, data) =
>> lsessions[i].get_command_status_output("exit;", 20)?
>> Yes, that would be better.
>>
>>>> +        for i in range(last_vm + 1, vmsc):
>>>> +            (status, data) = lsessions[i].get_command_status_output("exit;", 20)
>>>> +            if i == (vmsc - 1):
>>>> +                logging.info(get_stat([lvms[i]]))
>>>> +            lvms[i].destroy(gracefully=False)
>>
>> ----------- snip -----------
>>>> +    def phase_paralel():
>>>> +        """ Parallel page splitting """
>>>> +        logging.info("Phase 1: Parallel page splitting")
>>>> +        # We have to wait until the allocator is finished (it waits 5
>>>> +        # seconds to clean the socket)
>>>> +
>>>
>>> The whole function is very similar to phase_separate_first_guest;
>>> please refactor them.
>>>
>> Yes, those functions are a bit similar. On the other hand there are some
>> differences. Instead of creating a more complex function we agreed to
>> split them for better readability of the code.
>
> The trouble is that re-implementing it makes the code longer and more
> error prone.
>
OK, we'll create only one function.

>>
>> ----------- snip -----------
>>>> +        while shm < ksm_size:
>>>> +            if i > 64:
>>>> +                logging.info(get_stat(lvms))
>>>> +                raise error.TestError("SHM didn't merge the memory until DL")
>>>> +            logging.debug("Sleep(%d)" % (ksm_size / 200 * perf_ratio))
>>>> +            time.sleep(ksm_size / 200 * perf_ratio)
>>>> +            try:
>>>> +                shm = int(os.popen(cmd).readline().split()[2])
>>>> +                shm = shm * 4 / 1024
>>>> +            except:
>>>> +                raise error.TestError("Could not fetch shmem info from proc")
>>>
>>> Didn't you need to increase i?
>>>
>> Yes, you are right. The line "i = i + 1" somehow disappeared...
>
> Does it pass all the tests?
>
Not incrementing "i" could lead to an infinite loop, but only when KSM 
fails. That's why we hadn't found the mistake by testing... We'll fix 
this together with the other modifications.
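For the record, the corrected polling loop could be structured like this (shown standalone, with the /proc read injected as a callable so a read failure can't be mistaken for a KSM stall; the function and parameter names are illustrative, not the test's actual API):

```python
import time

def wait_for_merge(read_shm_mb, ksm_size, perf_ratio=1, max_polls=64,
                   sleep=time.sleep):
    """Poll shared memory until it reaches ksm_size or max_polls expires.

    read_shm_mb is a callable returning the current shared memory in MB
    (in the test it would parse /proc/<pid>/statm); sleep is injectable
    so the loop can be exercised without real delays.
    """
    shm = 0
    i = 0
    while shm < ksm_size:
        if i > max_polls:
            raise RuntimeError("SHM didn't merge the memory before the DL")
        sleep(ksm_size / 200.0 * perf_ratio)
        shm = read_shm_mb()
        i += 1  # the increment that was missing in the patch
    return shm
```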

>>
>> ----------- snip -----------
>>>> +    def compare_page(self, original, inmem):
>>>> +        """
>>>> +        Compare memory
>>>> +
>>>> +
>>>
>>>
>>> Why do you need it? Is it to really check that ksm didn't do damage?
>>> Interesting, I never doubted ksm on that. Actually it is a good idea to
>>> test...
>>>
>>
>> We were asked to do so (be paranoid, anything could happen). We can
>> make this optional in the config.
>
> I guess it just adds sanity; better leave it on always. I guess it's not
> that time consuming.
>
+1
(Actually, in Python this is kind of time consuming. But compared to 
the whole test duration it's nothing.)
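If the Python-level comparison ever becomes a bottleneck, comparing whole pages as byte slices keeps the inner loop in C (a sketch under that assumption, not the test's actual compare_page):

```python
PAGE_SIZE = 4096

def compare_pages(original, inmem):
    """Return the indices of 4 KB pages that differ between two
    equal-length byte buffers; slice comparison runs at C speed."""
    assert len(original) == len(inmem)
    bad_pages = []
    for off in range(0, len(original), PAGE_SIZE):
        if original[off:off + PAGE_SIZE] != inmem[off:off + PAGE_SIZE]:
            bad_pages.append(off // PAGE_SIZE)
    return bad_pages
```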

>>
>>
>> Once again thanks, I'm looking forward to your reply.
>>
>> Best regards,
>> Lukáš Doktor
>> --
>> To unsubscribe from this list: send the line "unsubscribe kvm" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

