From mboxrd@z Thu Jan  1 00:00:00 1970
From: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Subject: Re: [PATCH 4/4] KVM: MMU: Make mmu_shrink() scan nr_to_scan shadow
 pages
Date: Mon, 19 Dec 2011 19:21:37 +0900
Message-ID: <4EEF1031.8090801@oss.ntt.co.jp>
References: <20111212072242.8aaf64a3420608b8204702c7@gmail.com> <20111212072647.1990b19483b0a482a894a0f6@gmail.com> <20111216110611.GC26982@amt.cnet> <20111216235824.a8016959785b2bd869b84a0a@gmail.com> <4EEEF92F.9020807@redhat.com> <4EEF0257.9090507@oss.ntt.co.jp> <4EEF0354.4090603@redhat.com> <4EEF0A35.9010009@oss.ntt.co.jp> <4EEF0C02.5000906@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Takuya Yoshikawa <takuya.yoshikawa@gmail.com>,
	Marcelo Tosatti <mtosatti@redhat.com>, kvm@vger.kernel.org
To: Avi Kivity <avi@redhat.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from serv2.oss.ntt.co.jp ([222.151.198.100]:42447 "EHLO
	serv2.oss.ntt.co.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751840Ab1LSKU1 (ORCPT <rfc822;kvm@vger.kernel.org>);
	Mon, 19 Dec 2011 05:20:27 -0500
In-Reply-To: <4EEF0C02.5000906@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

(2011/12/19 19:03), Avi Kivity wrote:
>> IMO, the goal should be restricted to emergencies.
>>
>> So a possible solution may be:
>>      - we set the tuning parameters as conservatively as possible
>>      - pick up a guest with relatively high ratio
>>        (I have to think more how to achieve this)
>>      - move the vm_list head for fairness
>>
>> In an emergency, we should not mind the performance penalty so much.
>
> But is the shrinker really only called in emergencies?

No, sadly.

That is the problem.
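
To make the selection idea quoted above a bit more concrete, here is a
rough user-space sketch, not kernel code. It assumes "ratio" means used
shadow pages divided by the per-guest limit, which is only one possible
reading; Guest, pick_victim and min_ratio are hypothetical names invented
for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Guest:
    name: str
    used_pages: int  # shadow pages currently in use (hypothetical field)
    max_pages: int   # per-guest shadow page limit (hypothetical field)


def pick_victim(vm_list, min_ratio=0.5):
    """Pick the guest with the highest used/max ratio, or None.

    Guests below min_ratio are skipped, which models the conservative
    tuning. After a pick, the list is rotated so that the next scan
    starts just past the victim (assumed meaning of "move the vm_list
    head for fairness").
    """
    victim = None
    best = 0.0
    for guest in vm_list:
        if guest.max_pages == 0:
            continue
        ratio = guest.used_pages / guest.max_pages
        if ratio < min_ratio:
            continue
        if victim is None or ratio > best:
            victim, best = guest, ratio
    if victim is not None:
        # Rotate left so the victim ends up at the tail of the list.
        vm_list.rotate(-(vm_list.index(victim) + 1))
    return victim


vms = deque([Guest("a", 10, 100), Guest("b", 90, 100), Guest("c", 60, 100)])
chosen = pick_victim(vms)
```

With these made-up numbers, guest "b" is chosen and then moved to the
tail, so a following call starts its scan from "c". Whether such a ratio
is the right criterion is exactly the open question.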

>
> Also, with things like cgroups, we may have an emergency in one
> container, but not in others - if the shrinker is not cgroup aware, it
> soon will be.

That seems to be a common problem for everyone, not only KVM.

>> But there is no perfect value because how often mmu_shrink() can be
>> called will change if the admin changes the sysctl_vfs_cache_pressure
>> tuning parameter for dcache and icache, IIUC.
>>
>> And tdp and shadow paging differ much.
>
> We should aim for the following:
> - normal operation causes very few shrinks (some are okay)
> - high pressure mostly due to kvm results in kvm being shrunk (this is a
> pathological case caused by starting a guest with a huge amount of
> memory, and mapping it all to /dev/zero (or ksm), and getting the guest
> to create shadow mappings for all of it)
> - general high pressure is shared among other caches like dcache and icache
>
> The cost of reestablishing an mmu page can be as high as half a
> millisecond of cpu time, which is the reason I want to be conservative.
>

I agree with you.

I think I should CC lkml next time to hear from the mm specialists.
The shrinker has many heuristics that were added from a lot of experience;
since I lack such experience, I need help.

	Takuya