From: Keir Fraser <keir.xen@gmail.com>
To: Dan Magenheimer <dan.magenheimer@oracle.com>,
	Jan Beulich <JBeulich@novell.com>
Cc: Olaf Hering <olaf@aepfle.de>,
	Ian Campbell <Ian.Campbell@citrix.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	George Dunlap <George.Dunlap@eu.citrix.com>,
	George Shuklin <george.shuklin@gmail.com>,
	"Tim (Xen.org)" <tim@xen.org>,
	xen-devel@lists.xen.org, Dario Faggioli <raistlin@linux.it>,
	Kurt Hackel <kurt.hackel@oracle.com>,
	Zhigang Wang <zhigang.x.wang@oracle.com>,
	Ian Jackson <Ian.Jackson@eu.citrix.com>
Subject: Re: Proposed new "memory capacity claim" hypercall/feature
Date: Tue, 30 Oct 2012 00:17:12 +0100
Message-ID: <CCB4CD08.42BC0%keir.xen@gmail.com>
In-Reply-To: <806ef32a-2ded-42b5-a2ff-5f84ac1c1d47@default>

On 30/10/2012 00:03, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:

>> From: Keir Fraser [mailto:keir@xen.org]
>> Subject: Re: Proposed new "memory capacity claim" hypercall/feature
>> 
>> On 29/10/2012 21:08, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:
>> 
>> Well it does depend how scalable domain creation actually is as an
>> operation. If it is spending most of its time allocating memory then it is
>> quite likely that parallel creations will spend a lot of time competing for
>> the heap spinlock, and actually there will be little/no speedup compared
>> with serialising the creations. Further, if domain creation can take
>> minutes, it may be that we simply need to go optimise that -- we already
>> found one stupid thing in the heap allocator recently that was burning
>> loads of time during large-memory domain creations, and fixed it for a
>> massive speedup in that particular case.
> 
> I suppose ultimately it is a scalability question.  But Oracle's
> measure of success here is based on how long a human or a tool
> has to wait for confirmation to ensure that a domain will
> successfully launch.  If two domains are launched in parallel
> AND an indication is given that both will succeed, spinning on
> the heaplock a bit just makes for a longer "boot" time, which is
> just a cost of virtualization.  If they are launched in parallel
> and, minutes later (or maybe even 20 seconds later), one or
> both say "oops, I was wrong, there wasn't enough memory, so
> try again", that's not OK for data center operations, especially if
> there really was enough RAM for one, but not for both. Remember,
> in the Oracle environment, we are talking about an administrator/automation
> overseeing possibly hundreds of physical servers, not just a single
> user/server.
> 
> Does that make more sense?

Yes, that makes sense.

> The "claim" approach immediately guarantees success or failure.
> Unless there are enough "stupid things/optimisations" found that
> you would be comfortable putting memory allocation for a domain
> creation in a hypervisor spinlock, there will be a race unless
> an atomic mechanism exists such as "claiming" where
> only simple arithmetic must be done within a hypervisor lock.
> 
> Do you disagree?
> 
>>> and (2) tmem and/or other dynamic
>>> memory mechanisms may be asynchronously absorbing small-but-significant
>>> portions of RAM for other purposes during an attempted domain launch.
>> 
>> This is an argument against allocate-rather-than-reserve? I don't think that
>> makes sense -- so is this instead an argument against
>> reservation-as-a-toolstack-only-mechanism? I'm not actually convinced yet we
>> need reservations *at all*, before we get down to where it should be
>> implemented.
> 
> I'm not sure if we are defining terms the same, so that's hard
> to answer.  If you define "allocation" as "a physical RAM page frame
> number is selected (and possibly the physical page is zeroed)",
> then I'm not sure how your definition of "reservation" differs
> (because that's how increase/decrease_reservation are implemented
> in the hypervisor, right?).
> 
> Or did you mean "allocate-rather-than-claim" (where "allocate" means
> selecting a specific physical page frame and "claim" means doing
> accounting only)?  If so, see the atomicity argument above.
> 
> I'm not just arguing against reservation-as-a-toolstack-mechanism,
> I'm stating I believe unequivocally that reservation-as-a-toolstack-
> only-mechanism and tmem are incompatible.  (Well, not _totally_
> incompatible... the existing workaround, tmem freeze/thaw, works
> but is also single-threaded and has fairly severe unnecessary
> performance repercussions.  So I'd like to solve both problems
> at the same time.)

Okay, so why is tmem incompatible with implementing claims in the toolstack?

 -- Keir

> Dan
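
Editor's illustration: the "claim" idea quoted above (only simple
arithmetic done within a lock, with the slow page-by-page allocation
happening later) could be sketched roughly as follows. This is a toy
model, not Xen code. The class HeapAccounting, its methods, and the
counters free_pages and claimed_pages are hypothetical names chosen for
this example, and a Python threading.Lock merely stands in for the
hypervisor heap spinlock.

```python
import threading


class HeapAccounting:
    """Toy model of free-page accounting (hypothetical, not Xen's structures)."""

    def __init__(self, total_pages):
        self._lock = threading.Lock()   # stands in for the heap spinlock
        self.free_pages = total_pages   # pages not yet handed out
        self.claimed_pages = 0          # pages promised but not yet allocated

    def claim(self, nr_pages):
        """Reserve capacity atomically; only arithmetic happens under the lock."""
        with self._lock:
            if self.free_pages - self.claimed_pages < nr_pages:
                return False            # immediate, definite failure
            self.claimed_pages += nr_pages
            return True                 # immediate, definite success

    def allocate_claimed(self, nr_pages):
        """Later, slower allocation that draws down an existing claim."""
        with self._lock:
            if nr_pages > self.claimed_pages or nr_pages > self.free_pages:
                raise ValueError("allocation exceeds outstanding claim")
            self.claimed_pages -= nr_pages
            self.free_pages -= nr_pages

    def allocate_unclaimed(self, nr_pages):
        """Allocation by another consumer (e.g. a tmem-like mechanism).

        It may only use pages that are neither allocated nor claimed.
        """
        with self._lock:
            if self.free_pages - self.claimed_pages < nr_pages:
                return False
            self.free_pages -= nr_pages
            return True
```

In this model, two parallel launches that each call claim(60) against a
100-page heap get their answer at once: the first succeeds and the
second fails, rather than one of them failing minutes into allocation.
Because allocate_unclaimed() respects outstanding claims, an
asynchronous consumer cannot absorb pages already promised to a
launching domain.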
