From: Avi Kivity <avi@redhat.com>
To: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: ngupta@vflare.org, Jeremy Fitzhardinge <jeremy@goop.org>,
Dave Hansen <dave@linux.vnet.ibm.com>,
Pavel Machek <pavel@ucw.cz>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
hugh.dickins@tiscali.co.uk, JBeulich@novell.com,
chris.mason@oracle.com, kurt.hackel@oracle.com,
dave.mccracken@oracle.com, npiggin@suse.de,
akpm@linux-foundation.org, riel@redhat.com
Subject: Re: Frontswap [PATCH 0/4] (was Transcendent Memory): overview
Date: Mon, 03 May 2010 12:39:20 +0300
Message-ID: <4BDE99C8.1090002@redhat.com>
In-Reply-To: <b6cfd097-1003-47ce-9f1c-278835ba52d2@default>
On 05/02/2010 08:22 PM, Dan Magenheimer wrote:
>> It's bad, but it's better than ooming.
>>
>> The same thing happens with vcpus: you run 10 guests on one core, if
>> they all wake up, your cpu is suddenly 10x slower and has 30000x
>> interrupt latency (30ms vs 1us, assuming 3ms timeslices). Your disks
>> become slower as well.
>>
>> It's worse with memory, so you try to swap as a last resort. However,
>> swap is still faster than a crashed guest.
>>
> Your analogy only holds when the host administrator is either
> extremely greedy or stupid.
A 10x vcpu overcommit is reasonable in some situations (VDI, powersave
at night). Even a 2x vcpu overcommit will cause a 10000x interrupt
latency degradation.
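A rough sketch of the arithmetic behind these figures, assuming plain
round-robin scheduling in which a woken vcpu may wait one full rotation
(number of vcpus times the timeslice) before it runs. The function name
and the defaults (3ms timeslice, 1us baseline) are illustrative
assumptions taken from the numbers quoted above, not a model of any
real scheduler.

```python
def latency_factor(vcpus_per_core, timeslice_us=3000, baseline_us=1):
    """Worst-case interrupt latency relative to an idle core.

    Assumption: simple round-robin, so the worst-case wait is one full
    rotation through all vcpus sharing the core.
    """
    worst_case_wait_us = vcpus_per_core * timeslice_us
    return worst_case_wait_us / baseline_us


if __name__ == "__main__":
    for n in (2, 10):
        print(f"{n}x overcommit: ~{latency_factor(n):.0f}x latency")
```

Under this assumption 10 vcpus per core gives the 30ms (30000x) figure,
and 2 vcpus per core gives 6ms, which is on the order of 10^4.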
> My analogy only requires some
> statistical bad luck: Multiple guests with peaks and valleys
> of memory requirements happen to have their peaks align.
>
Not sure I understand.
>>> Third, host swapping makes live migration much more difficult.
>>> Either the host swap disk must be accessible to all machines
>>> or data sitting on a local disk MUST be migrated along with
>>> RAM (which is not impossible but complicates live migration
>>> substantially).
>>>
>> kvm does live migration with swapping, and has no special code to
>> integrate them.
>> :
>> Don't know about vmware, but kvm supports page sharing, swapping, and
>> live migration simultaneously.
>>
> Hmmm... I'll bet I can break it pretty easily. I think the
> case you raised that you thought would cause host OOM'ing
> will cause kvm live migration to fail.
>
> Or maybe not... when a guest is in the middle of a live migration,
> I believe (in Xen), the entire guest memory allocation (possibly
> excluding ballooned-out pages) must be simultaneously in RAM briefly
> in BOTH the host and target machine. That is, live migration is
> not "pipelined". Is this also true of KVM?
No. The entire guest address space can be swapped out on the source and
target, less the pages being copied to or from the wire, and pages
actively accessed by the guest. Of course performance will suck if all
memory is swapped out.
> If so, your
> statement above is just waiting for a corner case to break it.
> And if not, I expect you've got fault tolerance issues.
>
Not that I'm aware of.
>>> If you talk to VMware customers (especially web-hosting services)
>>> that have attempted to use overcommit technologies that require
>>> host-swapping, you will find that they quickly become allergic
>>> to memory overcommit and turn it off. The end users (users of
>>> the VMs that inexplicably grind to a halt) complain loudly.
>>> As a result, RAM has become a bottleneck in many many systems,
>>> which ultimately reduces the utility of servers and the value
>>> of virtualization.
>>>
>> Choosing the correct overcommit ratio is certainly not an easy task.
>> However, just hoping that memory will be available when you need it is
>> not a good solution.
>>
> Choosing the _optimal_ overcommit ratio is impossible without a
> prescient knowledge of the workload in each guest. Hoping memory
> will be available is certainly not a good solution, but if memory
> is not available guest swapping is much better than host swapping.
>
You cannot rely on guest swapping.
> And making RAM usage as dynamic as possible and live migration
> as easy as possible are keys to maximizing the benefits (and
> limiting the problems) of virtualization.
>
That is why you need overcommit. You make things dynamic with page
sharing and ballooning and live migration, but at some point you need a
failsafe fallback. The only failsafe fallback I can see (where the host
doesn't rely on guests) is swapping.
As far as I can tell, frontswap+tmem makes the problem worse. You loan
the guest some memory without the means to take it back, which increases
memory pressure on the host. The result is that if you want to avoid
swapping (or are unable to swap) you need to undercommit host resources.
Instead of
    sum(guest mem) + reserve < (host mem)
you need
    sum(guest mem + committed tmem) + reserve < (host mem)
You need more host memory, or fewer guests, or to be prepared to swap if
the worst happens.
--
error compiling committee.c: too many arguments to function