xen-devel.lists.xenproject.org archive mirror
* Request for input: Extended event channel support
@ 2013-03-27 11:23 George Dunlap
  2013-03-27 19:36 ` Anil Madhavapeddy
                   ` (3 more replies)
  0 siblings, 4 replies; 19+ messages in thread
From: George Dunlap @ 2013-03-27 11:23 UTC (permalink / raw)
  To: xen-devel@lists.xen.org, xen-users@lists.xen.org

* Executive summary

The number of event channels available for dom0 is currently one of
the biggest limitations on scaling up the number of VMs which can be
created on a single system.  There are two alternative implementations
we could choose, one of which is ready now, the other of which is
potentially technically superior, but will not be ready for the 4.3
release.

The core question we need to ask the community: How important is
lifting the event channel scalability limit for 4.3?  Will waiting
until 4.4 limit the uptake of the Xen platform?

* The issue

The existing event channel implementation for PV guests is implemented
as a 2-level bit array.  This limits the total number of event channels
to word_size ^ 2, which is 1024 for 32-bit guests and 4096 for 64-bit
guests.
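
To make the arithmetic concrete, here is a minimal sketch of the
2-level scheme (hypothetical names; the real guest ABI also has
per-vCPU selectors and mask bits):

    #include <stdio.h>

    #define BITS_PER_LONG      (sizeof(unsigned long) * 8)
    #define NR_EVENT_CHANNELS  (BITS_PER_LONG * BITS_PER_LONG)

    /* Level 1: one selector word; bit i set means level-2 word i has
     * pending events.  Level 2: the pending bitmap itself. */
    static unsigned long pending_sel;
    static unsigned long pending[BITS_PER_LONG];

    static void set_pending(unsigned int port)
    {
        pending[port / BITS_PER_LONG] |= 1UL << (port % BITS_PER_LONG);
        pending_sel |= 1UL << (port / BITS_PER_LONG);
    }

    int main(void)
    {
        /* 32 * 32 = 1024 on 32-bit, 64 * 64 = 4096 on 64-bit. */
        printf("max event channels: %zu\n", (size_t)NR_EVENT_CHANNELS);
        set_pending(42);
        return 0;
    }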

This sounds like a lot, until you consider that in a typical system,
each VM needs 4 or more event channels in domain 0.  This means that
for a 32-bit dom0, there is a theoretical maximum of 256 guests -- and
in practice it's more like 180 or so, because of event channels
required for other things.  XenServer already has customers using VDI
that require more VMs than this.

* The dilemma

When we began the 4.3 release cycle, this was one of the items we
identified as a key feature we needed to get for 4.3.  Wei Liu started
work on an extension of the existing implementation, allowing 3 levels
of event channels.  The draft of this is ready, and just needs the
last bit of polishing and bug-chasing before it can be accepted.

However, several months ago, David Vrabel came up with an alternate
design which in theory was more scalable, based on queues of linked
lists (which we have internally been calling "FIFO" for short).  David
has been working on the implementation since, and has a draft
prototype; but it's in no shape to be included in 4.3.

There are some things that are attractive about the second solution,
including the flexible assignment of interrupt priorities, ease of
scalability, and potentially even the FIFO nature of the interrupt
delivery.

The question at hand, then, is whether to take what we have in the
3-level implementation for 4.3, or wait to see how the FIFO
implementation turns out (taking either it or the 3-level
implementation in 4.4).

* The solution in hand: 3-level event channels

The basic idea behind 3-level event channels is to extend the existing
2-level implementation to 3 levels.  Going to 3 levels would give us
32k event channels for 32-bit, and 256k for 64-bit.
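
As a rough sketch (a hypothetical helper, not the actual patch code),
decomposing a port number under a 3-level scheme looks like this, and
makes the word_size ^ 3 capacity evident:

    /* BITS is 32 or 64; capacity is BITS^3:
     * 32^3 = 32,768 and 64^3 = 262,144. */
    #define BITS (sizeof(unsigned long) * 8)

    static void port_to_levels(unsigned int port, unsigned int *l1,
                               unsigned int *l2, unsigned int *l3)
    {
        *l1 = port / (BITS * BITS); /* bit in the top-level selector  */
        *l2 = (port / BITS) % BITS; /* bit in the 2nd-level selector  */
        *l3 = port % BITS;          /* bit in the pending bitmap word */
    }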

One of the advantages of this method is that since it is similar to
the existing method, the general concepts and race conditions are
fairly well understood and tested.

One of the disadvantages that this method inherits from the 2-level
event channels is the lack of priority.  In the initial implementation
of event channels, priority was handled by event channel order: scans
for events always started at 0 and went upwards.  However, this was
not very scalable, as lower-numbered events could easily completely
lock out higher-numbered events; and frequently "lower-numbered"
simply meant "created earlier".  Event channels were forced into a
priority even if one was not wanted.

So the implementation was tweaked so that scans don't start at 0, but
continue where the last scan left off.  This made it so that earlier
events were not prioritized and removed the starvation issue, but at
the cost of removing all event priorities.  Certain events, like the
timer event, are special-cased to be always checked, but this is
rather a hack and not very scalable or flexible.
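
A sketch of that tweak (hypothetical names again; the real code also
has to honour per-port masks):

    #define BITS (sizeof(unsigned long) * 8)

    /* Resume the selector scan where the previous one stopped, so that
     * low-numbered ports can no longer starve high-numbered ones. */
    static unsigned int last_word;

    static int next_pending_word(unsigned long sel)
    {
        for (unsigned int i = 1; i <= BITS; i++) {
            unsigned int w = (last_word + i) % BITS;
            if (sel & (1UL << w)) {
                last_word = w;
                return (int)w;
            }
        }
        return -1; /* nothing pending */
    }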

One thing that should be noted is that the extra level is envisioned
to be used only by guests that need the extended event channel space,
such as dom0 and driver domains; domUs will continue to use the
2-level version.

* The solution close at hand: FIFO event channels

The FIFO solution makes event delivery a matter of adding items to a
highly structured linked list.  The number of event channels for the
interface design has a theoretical maximum of 2^28; the current
implementation is limited to 2^17, which is over 100,000.  The
number is the same for both 32-bit and 64-bit kernels.

One of the key design advantages of the FIFO is the ability to assign
an arbitrary priority to any event.  There are 16 priorities
available; one queue for each priority.  Higher-priority queues are
handled before lower-priority queues, but events within a queue are
handled in FIFO order.
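
In outline, the design pairs one word per event with one queue per
priority.  This is a sketch based on the description above -- the bit
names, link width, and queue layout here are assumptions, not the
final ABI:

    #include <stdint.h>

    #define FIFO_PENDING   (UINT32_C(1) << 31) /* event is pending       */
    #define FIFO_MASKED    (UINT32_C(1) << 30) /* delivery is masked     */
    #define FIFO_LINKED    (UINT32_C(1) << 29) /* already on some queue  */
    #define FIFO_LINK_MASK ((UINT32_C(1) << 17) - 1) /* next event index */

    /* One 32-bit word per event channel, in pages shared with Xen. */
    static uint32_t event_word[UINT32_C(1) << 17];

    /* 16 priorities, one queue each: events are appended at the tail
     * and consumed from the head, i.e. FIFO within a priority. */
    struct fifo_queue { uint32_t head, tail; };
    static struct fifo_queue queues[16];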

Another potential advantage is the FIFO ordering itself.  With the
current event channel implementation, one can construct scenarios
where, even among events of the same priority, clusters of events can
lock out others depending on where they sit in the bit array and how
many of them there are.  FIFO delivery solves this by handling events
within the same priority strictly in the order in which they were
raised.  It's not clear yet, however, whether this has a measurable
impact on performance.

One of the potential disadvantages of the FIFO solution is the amount
of memory that it requires to be mapped into the Xen address space.
The FIFO solution requires an entire word per event channel; a
reasonably configured system might have up to 128 Xen-mapped pages per
dom0 or domU.  On the other hand, this number can be scaled at a
fine-grained level, and limited by the toolstack; a typical domU would
require only one page mapped in the hypervisor.

By comparison, the 3-level solution requires only two bits per event
channel.  A domain using the extra level would require exactly 16
pages if 64-bit, or 2 pages if 32-bit.  We would
expect this to include dom0 and any driver domains, but that domUs
would continue using 2-level event channels (and thus require no extra
pages to be mapped).
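
The page counts above work out as follows (assuming 4 KiB pages):

    FIFO:    2^17 events * 32 bits = 512 KiB = 128 pages  (worst case)
    3-level: 2^18 events *  2 bits =  64 KiB =  16 pages  (64-bit)
             2^15 events *  2 bits =   8 KiB =   2 pages  (32-bit)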

* Considerations

There are a number of additional considerations to take into account.

The first is that the hypervisor maintainers have made it clear that
once 3-level event channels are accepted, FIFO will have a higher bar
to clear for acceptance.  That is, if we wait until the 4.4 timeframe
before choosing one to accept, then FIFO will only need to be
marginally preferable to 3-level to be accepted.  However, if we
accept the 3-level implementation for 4.3, then FIFO will need to
demonstrate that it is significantly better than 3-level in order to
be accepted.

We are not yet aware of any companies that are blocked on this
feature.  Citrix XenServer clients using Citrix's VDI solution need to
be able to run more than 200 guests; however, because XenServer
controls both the kernel and the hypervisor side, it can introduce
temporary, non-backwards- or forwards-compatible changes to work
around the limitation, and so is not blocked.  Oracle and SuSE have
not indicated that this is a feature they are in dire need of.  Most
cloud deployments that we know of -- even extremely large ones like
Amazon or Rackspace -- use large numbers of relatively inexpensive
computers, and so typically do not need to run more than 200 VMs per
physical host.

Another factor is that we are considering attempting a shorter
release cadence for 4.4 -- 6 months or possibly less -- which would
reduce the impact of delaying the event channel scalability feature.

* What we need to know

What we're missing in order to make an informed decision is voices
from the community: If we delay the event channel scalability feature
until 4.4, how likely is this to be an issue?  Are there current users
or potential users of Xen who need to be able to scale past 200 VMs on
a single host, and who would end up choosing another hypervisor if
this feature were delayed?

Thank you for your time and input.

 -George Dunlap,
  4.3 Release manager

* Re: Request for input: Extended event channel support
  2013-03-27 11:23 Request for input: Extended event channel support George Dunlap
@ 2013-03-27 19:36 ` Anil Madhavapeddy
  2013-03-27 21:53   ` David Vrabel
  2013-03-28 12:51   ` Felipe Franciosi
  2013-03-28  1:56 ` Konrad Rzeszutek Wilk
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 19+ messages in thread
From: Anil Madhavapeddy @ 2013-03-27 19:36 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-users, cl-mirage@lists.cam.ac.uk List, xen-devel

On 27 Mar 2013, at 11:23, George Dunlap <dunlapg@umich.edu> wrote:
> 
> The FIFO solution makes event delivery a matter of adding items to a
> highly structured linked list.  The number of event channels for the
> interface design has a theoretical maximum of 2^28; the current
> implementation is limited to 2^17, which is over 100,000.  The
> number is the same for both 32-bit and 64-bit kernels.

Is there any reason for such a low default?  If I'm not mistaken,
every guest needs at least 2 event channels (console, xenstore) and
probably has two more for a net and disk device.

With stub-domains in the mix, we could easily imagine running 25,000
VMs with a couple of megabytes of RAM each using Mirage (which can
boot very low memory guests without too much trouble).  This does
run into other problems with CPU scheduling and device scalability,
but it would be nice if any proposed event channel upgrade went well
above this level rather than scrape it.
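
(Rough arithmetic: 25,000 VMs at ~4 channels each is already ~100,000
ports, uncomfortably close to the 2^17 = 131,072 limit.)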

I personally prefer the 4.3 solution (despite the priority hack for
the timers) just because the existing limitation is so very trivial
to hit.  However, I have no view on the level of technical debt that
would be incurred if it subsequently required switching to the FIFO
solution in 4.4 and caused yet another round of upgrades.  That's
your problem; I just want the extra domains :-)

-anil

* Re: Request for input: Extended event channel support
  2013-03-27 19:36 ` Anil Madhavapeddy
@ 2013-03-27 21:53   ` David Vrabel
  2013-03-27 22:28     ` Anil Madhavapeddy
  2013-03-27 22:31     ` Wei Liu
  2013-03-28 12:51   ` Felipe Franciosi
  1 sibling, 2 replies; 19+ messages in thread
From: David Vrabel @ 2013-03-27 21:53 UTC (permalink / raw)
  To: Anil Madhavapeddy
  Cc: xen-users, George Dunlap, cl-mirage@lists.cam.ac.uk List,
	xen-devel

On 27/03/2013 19:36, Anil Madhavapeddy wrote:
> On 27 Mar 2013, at 11:23, George Dunlap <dunlapg@umich.edu> wrote:
>>
>> The FIFO solution makes event delivery a matter of adding items to a
>> highly structured linked list.  The number of event channels for the
>> interface design has a theoretical maximum of 2^28; the current
>> implementation is limited to 2^17, which is over 100,000.  The
>> number is the same for both 32-bit and 64-bit kernels.
> 
> Is there any reason for such a low default?  If I'm not mistaken,
> every guest needs at least 2 event channels (console, xenstore) and
> probably has two more for a net and disk device.

131,072 seemed high enough to me but I'd forgotten about the Mirage use
case.

This can be trivially raised to 2^19 (524,288).  Beyond that, the
implementation becomes slightly more complex as the pointers to the
event array pages no longer fit in a single page.
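
(Back-of-the-envelope, assuming 4 KiB pages and 8-byte pointers: 2^19
events * 4 bytes = 2 MiB of event words = 512 pages, and 512 page
pointers * 8 bytes = 4096 bytes -- exactly one page.)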

> With stub-domains in the mix, we could easily imagine running 25,000
> VMs with a couple of megabytes of RAM each using Mirage (which can
> boot very low memory guests without too much trouble).

Having said that, with 25,000 VMs it would seem sensible to disaggregate
things like the console and xenstore (in addition to the network and
block backends), thus reducing the need for event channels for any
single domain.

David

* Re: Request for input: Extended event channel support
  2013-03-27 21:53   ` David Vrabel
@ 2013-03-27 22:28     ` Anil Madhavapeddy
  2013-03-27 22:31     ` Wei Liu
  1 sibling, 0 replies; 19+ messages in thread
From: Anil Madhavapeddy @ 2013-03-27 22:28 UTC (permalink / raw)
  To: David Vrabel
  Cc: xen-users, George Dunlap, cl-mirage@lists.cam.ac.uk List,
	xen-devel

On 27 Mar 2013, at 21:53, David Vrabel <dvrabel@cantab.net> wrote:
> On 27/03/2013 19:36, Anil Madhavapeddy wrote:
>> On 27 Mar 2013, at 11:23, George Dunlap <dunlapg@umich.edu> wrote:
>>> 
>>> The FIFO solution makes event delivery a matter of adding items to a
>>> highly structured linked list.  The number of event channels for the
>>> interface design has a theoretical maximum of 2^28; the current
>>> implementation is limited to 2^17, which is over 100,000.  The
>>> number is the same for both 32-bit and 64-bit kernels.
>> 
>> Is there any reason for such a low default?  If I'm not mistaken,
>> every guest needs at least 2 event channels (console, xenstore) and
>> probably has two more for a net and disk device.
> 
> 131,072 seemed high enough to me but I'd forgotten about the Mirage use
> case.
> 
> This can be trivially raised to 2^19 (524,288).  Beyond that, the
> implementation becomes slightly more complex as the pointers to the
> event array pages no longer fit in a single page.

Makes sense.

>> With stub-domains in the mix, we could easily imagine running 25,000
>> VMs with a couple of megabytes of RAM each using Mirage (which can
>> boot very low memory guests without too much trouble).
> 
> Having said that, with 25,000 VMs it would seem sensible to disaggregate
> things like the console and xenstore (in addition to the network and
> block backends), thus reducing the need for event channels for any
> single domain.


Yeah indeed; this should be pretty easy to do and let the existing
2^17 be enough for a long time too.  We'd need to think a bit about a
distributed xenstored to avoid having one hotspot servicing so many
VMs.

One nice thing about the OCaml xenstored is that it should be possible
to make an explicitly distributed implementation of the protocol.
The data-structure is already based around immutable trees, so it's a
matter of figuring out where to put the consensus logic (probably around
/local/domain/*).

-anil

* Re: Request for input: Extended event channel support
  2013-03-27 21:53   ` David Vrabel
  2013-03-27 22:28     ` Anil Madhavapeddy
@ 2013-03-27 22:31     ` Wei Liu
  1 sibling, 0 replies; 19+ messages in thread
From: Wei Liu @ 2013-03-27 22:31 UTC (permalink / raw)
  To: David Vrabel
  Cc: xen-users, George Dunlap, xen-devel@lists.xen.org,
	cl-mirage@lists.cam.ac.uk List, Anil Madhavapeddy

On Wed, Mar 27, 2013 at 9:53 PM, David Vrabel <dvrabel@cantab.net> wrote:
> On 27/03/2013 19:36, Anil Madhavapeddy wrote:
>> On 27 Mar 2013, at 11:23, George Dunlap <dunlapg@umich.edu> wrote:
>>>
>>> The FIFO solution makes event delivery a matter of adding items to a
>>> highly structured linked list.  The number of event channels for the
>>> interface design has a theoretical maximum of 2^28; the current
>>> implementation is limited to 2^17, which is over 100,000.  The
>>> number is the same for both 32-bit and 64-bit kernels.
>>
>> Is there any reason for such a low default?  If I'm not mistaken,
>> every guest needs at least 2 event channels (console, xenstore) and
>> probably has two more for a net and disk device.
>
> 131,072 seemed high enough to me but I'd forgotten about the Mirage use
> case.
>
> This can be trivially raised to 2^19 (524,288).  Beyond that, the
> implementation becomes slightly more complex as the pointers to the
> event array pages no longer fit in a single page.
>

Then that would require 512 pages mapped in Xen in the worst case, plus

>> With stub-domains in the mix, we could easily imagine running 25,000
>> VMs with a couple of megabytes of RAM each using Mirage (which can
>> boot very low memory guests without too much trouble).
>

25,000 pages for domUs if domU uses this ABI as well.

This might require bumping the global mapping space in Xen, or we can
restrict domUs to the default 2-level ABI to solve this problem.

But let's not worry about future things for now.


Wei.

> Having said that, with 25,000 VMs it would seem sensible to disaggregate
> things like the console and xenstore (in addition to the network and
> block backends), thus reducing the need for event channels for any
> single domain.
>
> David
>

* Re: Request for input: Extended event channel support
  2013-03-27 11:23 Request for input: Extended event channel support George Dunlap
  2013-03-27 19:36 ` Anil Madhavapeddy
@ 2013-03-28  1:56 ` Konrad Rzeszutek Wilk
  2013-03-28 11:10   ` George Dunlap
  2013-03-29 13:05 ` Konrad Rzeszutek Wilk
  2013-04-04 13:31 ` George Dunlap
  3 siblings, 1 reply; 19+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-03-28  1:56 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-users@lists.xen.org, xen-devel@lists.xen.org

On Wed, Mar 27, 2013 at 11:23:23AM +0000, George Dunlap wrote:
> * Executive summary
> 
> The number of event channels available for dom0 is currently one of
> the biggest limitations on scaling up the number of VMs which can be
> created on a single system.  There are two alternative implementations
> we could choose, one of which is ready now, the other of which is
> potentially technically superior, but will not be ready for the 4.3
> release.
> 
> The core question we need to ask the community: How important is
> lifting the event channel scalability limit for 4.3?  Will waiting
> until 4.4 limit the uptake of the Xen platform?
> 
> * The issue
> 
> The existing event channel implementation for PV guests is implemented
> as a 2-level bit array.  This limits the total number of event channels
> to word_size ^ 2, which is 1024 for 32-bit guests and 4096 for 64-bit
> guests.
> 
> This sounds like a lot, until you consider that in a typical system,
> each VM needs 4 or more event channels in domain 0.  This means that
> for a 32-bit dom0, there is a theoretical maximum of 256 guests -- and
> in practice it's more like 180 or so, because of event channels
> required for other things.  XenServer already has customers using VDI
> that require more VMs than this.
> 
> * The dilemma
> 
> When we began the 4.3 release cycle, this was one of the items we
> identified as a key feature we needed to get for 4.3.  Wei Liu started
> work on an extension of the existing implementation, allowing 3 levels
> of event channels.  The draft of this is ready, and just needs the
> last bit of polishing and bug-chasing before it can be accepted.
> 
> However, several months ago, David Vrabel came up with an alternate
> design which in theory was more scalable, based on queues of linked
> lists (which we have internally been calling "FIFO" for short).  David
> has been working on the implementation since, and has a draft
> prototype; but it's in no shape to be included in 4.3.
> 
> There are some things that are attractive about the second solution,
> including the flexible assignment of interrupt priorities, ease of
> scalability, and potentially even the FIFO nature of the interrupt
> delivery.
> 
> The question at hand, then, is whether to take what we have in the
> 3-level implementation for 4.3, or wait to see how the FIFO
> implementation turns out (taking either it or the 3-level
> implementation in 4.4).
> 
> * The solution in hand: 3-level event channels
> 
> The basic idea behind 3-level event channels is to extend the existing
> 2-level implementation to 3 levels.  Going to 3 levels would give us
> 32k event channels for 32-bit, and 256k for 64-bit.
> 
> One of the advantages of this method is that since it is similar to
> the existing method, the general concepts and race conditions are
> fairly well understood and tested.
> 
> One of the disadvantages that this method inherits from the 2-level
> event channels is the lack of priority.  In the initial implementation
> of event channels, priority was handled by event channel order: scans
> for events always started at 0 and went upwards.  However, this was
> not very scalable, as lower-numbered events could easily completely
> lock out higher-numbered events; and frequently "lower-numbered"
> simply meant "created earlier".  Event channels were forced into a
> priority even if one was not wanted.
> 
> So the implementation was tweaked so that scans don't start at 0, but
> continue where the last scan left off.  This made it so that earlier
> events were not prioritized and removed the starvation issue, but at
> the cost of removing all event priorities.  Certain events, like the
> timer event, are special-cased to be always checked, but this is
> rather a hack and not very scalable or flexible.

Hm, I actually think that is not in the upstream kernel at all. That
would explain why, on a very heavily loaded guest, the "hrtimer:
interrupt took XXxXXXXxx ns" message is printed.

Is this patch available somewhere?

* Re: Request for input: Extended event channel support
  2013-03-28  1:56 ` Konrad Rzeszutek Wilk
@ 2013-03-28 11:10   ` George Dunlap
  2013-03-28 11:34     ` Jan Beulich
  0 siblings, 1 reply; 19+ messages in thread
From: George Dunlap @ 2013-03-28 11:10 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: David Vrabel, xen-devel@lists.xen.org

On Thu, Mar 28, 2013 at 1:56 AM, Konrad Rzeszutek Wilk
<konrad.wilk@oracle.com> wrote:
>> So the implementation was tweaked so that scans don't start at 0, but
>> continue where the last scan left off.  This made it so that earlier
>> events were not prioritized and removed the starvation issue, but at
>> the cost of removing all event priorities.  Certain events, like the
>> timer event, are special-cased to be always checked, but this is
>> rather a hack and not very scalable or flexible.
>
> Hm, I actually think that is not in the upstream kernel at all. That
> would explain why, on a very heavily loaded guest, the "hrtimer:
> interrupt took XXxXXXXxx ns" message is printed.
>
> Is this patch available somewhere?

I think it was David who told me this -- maybe there is such a hack on
the "classic xen" kernel we're using in XenServer?

 -George

* Re: Request for input: Extended event channel support
  2013-03-28 11:10   ` George Dunlap
@ 2013-03-28 11:34     ` Jan Beulich
  0 siblings, 0 replies; 19+ messages in thread
From: Jan Beulich @ 2013-03-28 11:34 UTC (permalink / raw)
  To: George Dunlap, Konrad Rzeszutek Wilk
  Cc: David Vrabel, xen-devel@lists.xen.org

>>> On 28.03.13 at 12:10, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
> On Thu, Mar 28, 2013 at 1:56 AM, Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com> wrote:
>>> So the implementation was tweaked so that scans don't start at 0, but
>>> continue where the last scan left off.  This made it so that earlier
>>> events were not prioritized and removed the starvation issue, but at
>>> the cost of removing all event priorities.  Certain events, like the
>>> timer event, are special-cased to be always checked, but this is
>>> rather a hack and not very scalable or flexible.
>>
>> Hm, I actually think that is not in the upstream kernel at all. That
>> would explain why, on a very heavily loaded guest, the "hrtimer:
>> interrupt took XXxXXXXxx ns" message is printed.
>>
>> Is this patch available somewhere?
> 
> I think it was David who told me this -- maybe there is such a hack on
> the "classic xen" kernel we're using in XenServer?

Indeed - see 1038:a66a7c64b1d0  on linux-2.6.18-xen.hg.

Jan

* Re: Request for input: Extended event channel support
  2013-03-27 19:36 ` Anil Madhavapeddy
  2013-03-27 21:53   ` David Vrabel
@ 2013-03-28 12:51   ` Felipe Franciosi
  2013-03-28 12:54     ` Anil Madhavapeddy
  2013-04-10 10:45     ` Ian Campbell
  1 sibling, 2 replies; 19+ messages in thread
From: Felipe Franciosi @ 2013-03-28 12:51 UTC (permalink / raw)
  To: 'Anil Madhavapeddy'
  Cc: xen-users@lists.xen.org, George Dunlap,
	cl-mirage@lists.cam.ac.uk List, xen-devel@lists.xen.org

-----Original Message-----
From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Anil Madhavapeddy
Sent: 27 March 2013 19:37
To: George Dunlap
Cc: xen-users@lists.xen.org; cl-mirage@lists.cam.ac.uk List; xen-devel@lists.xen.org
Subject: Re: [Xen-devel] Request for input: Extended event channel support


> If I'm not mistaken, every guest needs at least 2 event channels (console, xenstore) and probably has two more for a net and disk device.

Presumably for vCPUs as well IINM?

Felipe

* Re: Request for input: Extended event channel support
  2013-03-28 12:51   ` Felipe Franciosi
@ 2013-03-28 12:54     ` Anil Madhavapeddy
  2013-03-28 13:02       ` Felipe Franciosi
  2013-04-10 10:45       ` [Xen-users] " Ian Campbell
  2013-04-10 10:45     ` Ian Campbell
  1 sibling, 2 replies; 19+ messages in thread
From: Anil Madhavapeddy @ 2013-03-28 12:54 UTC (permalink / raw)
  To: Felipe Franciosi
  Cc: xen-users@lists.xen.org, George Dunlap,
	cl-mirage@lists.cam.ac.uk List, xen-devel@lists.xen.org


On 28 Mar 2013, at 12:51, Felipe Franciosi <felipe.franciosi@citrix.com> wrote:

> -----Original Message-----
> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Anil Madhavapeddy
> Sent: 27 March 2013 19:37
> To: George Dunlap
> Cc: xen-users@lists.xen.org; cl-mirage@lists.cam.ac.uk List; xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] Request for input: Extended event channel support
> 
> 
>> If I'm not mistaken, every guest needs at least 2 event channels (console, xenstore) and probably has two more for a net and disk device.
> 
> Presumably for vCPUs as well IINM?

Yes, except that in Mirage's case we're single vCPU only, and use multiple VMs to act as parallel processes with explicit message passing.

But we would still need an event channel for the vchan shared ring, in this case too...

-anil

* Re: Request for input: Extended event channel support
  2013-03-28 12:54     ` Anil Madhavapeddy
@ 2013-03-28 13:02       ` Felipe Franciosi
  2013-04-10 10:45       ` [Xen-users] " Ian Campbell
  1 sibling, 0 replies; 19+ messages in thread
From: Felipe Franciosi @ 2013-03-28 13:02 UTC (permalink / raw)
  To: 'Anil Madhavapeddy'
  Cc: xen-users@lists.xen.org, George Dunlap,
	cl-mirage@lists.cam.ac.uk List, xen-devel@lists.xen.org



-----Original Message-----
From: Anil Madhavapeddy [mailto:anil@recoil.org] 
Sent: 28 March 2013 12:54
To: Felipe Franciosi
Cc: xen-users@lists.xen.org; George Dunlap; cl-mirage@lists.cam.ac.uk List; xen-devel@lists.xen.org
Subject: Re: [Xen-devel] Request for input: Extended event channel support

>> Presumably for vCPUs as well IINM?

>Yes, except that in Mirage's case we're single vCPU only, and use multiple VMs to act as parallel processes with explicit message passing.

There's also the buffered IO event channel, but I'm pretty sure this is only for HVM so shouldn't affect the Mirage use case.
Just mentioning it in case someone out there is reading this and working out numbers for HVM guests. :)

http://lists.xen.org/archives/html/xen-changelog/2011-11/msg00139.html

Cheers,
Felipe

* Re: Request for input: Extended event channel support
  2013-03-27 11:23 Request for input: Extended event channel support George Dunlap
  2013-03-27 19:36 ` Anil Madhavapeddy
  2013-03-28  1:56 ` Konrad Rzeszutek Wilk
@ 2013-03-29 13:05 ` Konrad Rzeszutek Wilk
  2013-04-02  7:44   ` Jan Beulich
  2013-04-04 13:31 ` George Dunlap
  3 siblings, 1 reply; 19+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-03-29 13:05 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-users@lists.xen.org, xen-devel@lists.xen.org

> * What we need to know
> 
> What we're missing in order to make an informed decision is voices
> from the community: If we delay the event channel scalability feature
> until 4.4, how likely is this to be an issue?  Are there current users
> or potential users of Xen who need to be able to scale past 200 VMs on
> a single host, and who would end up choosing another hypervisor if
> this feature were delayed?

For this to work you also need the Linux-side patches. That means
that if you want to hit the v3.10 merge window you have until
April 15th to get it in. The reason is that I am out from
April 20th, and the merge window will probably be open on May 1st.

We need at least one week to work out any bugs when it goes
in #linux-next - hence the April 15th deadline.

Technically speaking, the FIFO looks more appealing than the
three-level events, but that is a subjective opinion.

The reality is that what really needs to be determined is which one
will give better performance. From a design perspective it looks
as if FIFO is the clear winner, but perhaps not - I have only briefly
looked over the paper.

Anyhow, I am leaning towards the FIFO - but I think that if there are
existing people who want this functionality _right now_, then
the 3-level event channels would offer a stop-gap option. And they
can apply it to their hypervisor + Linux by hand, right?

> 
> Thank you for your time and input.
> 
>  -George Dunlap,
>   4.3 Release manager
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
> 

* Re: Request for input: Extended event channel support
  2013-03-29 13:05 ` Konrad Rzeszutek Wilk
@ 2013-04-02  7:44   ` Jan Beulich
  2013-04-02 14:20     ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Beulich @ 2013-04-02  7:44 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, George Dunlap
  Cc: xen-users@lists.xen.org, xen-devel@lists.xen.org

>>> On 29.03.13 at 14:05, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
>>  * What we need to know
>> 
>> What we're missing in order to make an informed decision is voices
>> from the community: If we delay the event channel scalability feature
>> until 4.4, how likely is this to be an issue?  Are there current users
>> or potential users of Xen who need to be able to scale past 200 VMs on
>> a single host, and who would end up choosing another hypervisor if
>> this feature were delayed?
> 
> For this to work you also need the Linux-side patches. That means
> that if you want to hit the v3.10 merge window you have until
> April 15th to get it in. The reason is that I am out from
> April 20th, and the merge window will probably be open on May 1st.

I don't think upstream inclusion of the Linux side patches is a
requirement here. The patches need to exist (or else the code
can't be tested), but there's no need for them to be in 3.10 as
far as the interface selection is concerned.

Jan

* Re: Request for input: Extended event channel support
  2013-04-02  7:44   ` Jan Beulich
@ 2013-04-02 14:20     ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 19+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-04-02 14:20 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-users@lists.xen.org, George Dunlap, xen-devel@lists.xen.org

On Tue, Apr 02, 2013 at 08:44:47AM +0100, Jan Beulich wrote:
> >>> On 29.03.13 at 14:05, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> >>  * What we need to know
> >> 
> >> What we're missing in order to make an informed decision is voices
> >> from the community: If we delay the event channel scalability feature
> >> until 4.4, how likely is this to be an issue?  Are there current users
> >> or potential users of Xen who need to be able to scale past 200 VMs on
> >> a single host, and who would end up choosing another hypervisor if
> >> this feature were delayed?
> > 
> > For this to work you also need the Linux-side patches. That means
> > that if you want to hit the v3.10 merge window you have until
> > April 15th to get it in. The reason is that I am out from
> > April 20th, and the merge window will probably be open on May 1st.
> 
> I don't think upstream inclusion of the Linux side patches is a
> requirement here. The patches need to exist (or else the code
> can't be tested), but there's no need for them to be in 3.10 as
> far as the interface selection is concerned.
> 

OK.

* Re: Request for input: Extended event channel support
  2013-03-27 11:23 Request for input: Extended event channel support George Dunlap
                   ` (2 preceding siblings ...)
  2013-03-29 13:05 ` Konrad Rzeszutek Wilk
@ 2013-04-04 13:31 ` George Dunlap
  2013-04-10 10:49   ` [Xen-users] " Ian Campbell
  3 siblings, 1 reply; 19+ messages in thread
From: George Dunlap @ 2013-04-04 13:31 UTC (permalink / raw)
  To: xen-devel@lists.xen.org, xen-users@lists.xen.org

On Wed, Mar 27, 2013 at 11:23 AM, George Dunlap <dunlapg@umich.edu> wrote:
> * Executive summary
>
> The number of event channels available for dom0 is currently one of
> the biggest limitations on scaling up the number of VMs which can be
> created on a single system.  There are two alternative implementations
> we could choose, one of which is ready now, the other of which is
> potentially technically superior, but will not be ready for the 4.3
> release.
>
> The core question we need to ask the community: How important is
> lifting the event channel scalability limit for 4.3?  Will waiting
> until 4.4 limit the uptake of the Xen platform?

So far the only one who has indicated a preference either way is Anil,
who is impatient to be rid of the limit on the number of tiny Mirage
VMs he can create. :-)

I think overall then I'm leaning towards recommending that we put the
decision off until 4.4.

 -George

* Re: Request for input: Extended event channel support
  2013-03-28 12:51   ` Felipe Franciosi
  2013-03-28 12:54     ` Anil Madhavapeddy
@ 2013-04-10 10:45     ` Ian Campbell
  1 sibling, 0 replies; 19+ messages in thread
From: Ian Campbell @ 2013-04-10 10:45 UTC (permalink / raw)
  To: Felipe Franciosi
  Cc: xen-users@lists.xen.org, George Dunlap, xen-devel@lists.xen.org,
	cl-mirage@lists.cam.ac.uk List, 'Anil Madhavapeddy'

On Thu, 2013-03-28 at 12:51 +0000, Felipe Franciosi wrote:
> -----Original Message-----
> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Anil Madhavapeddy
> Sent: 27 March 2013 19:37
> To: George Dunlap
> Cc: xen-users@lists.xen.org; cl-mirage@lists.cam.ac.uk List; xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] Request for input: Extended event channel support
> 
> 
> > If I'm not mistaken, every guest needs at least 2 event channels (console, xenstore) and probably has two more for a net and disk device.
> 
> Presumably for vCPUs as well IINM?

Aren't those (the vCPU IPI event channels, timers, etc.) internal to the
guest though? What we want to count here are event channels with an
end point inside dom0.

Ian.

* Re: [Xen-users] Request for input: Extended event channel support
  2013-03-28 12:54     ` Anil Madhavapeddy
  2013-03-28 13:02       ` Felipe Franciosi
@ 2013-04-10 10:45       ` Ian Campbell
  2013-04-10 16:14         ` Anil Madhavapeddy
  1 sibling, 1 reply; 19+ messages in thread
From: Ian Campbell @ 2013-04-10 10:45 UTC (permalink / raw)
  To: Anil Madhavapeddy
  Cc: xen-users@lists.xen.org, George Dunlap, Felipe Franciosi,
	cl-mirage@lists.cam.ac.uk List, xen-devel@lists.xen.org

On Thu, 2013-03-28 at 12:54 +0000, Anil Madhavapeddy wrote:
> Yes, except that in Mirage's case we're single vCPU only, and use
> multiple VMs to act as parallel processes with explicit message
> passing.
> 
> But we would still need an event channel for the vchan shared ring, in
> this case too... 

That would be between two mirage guests though, unless you are
envisaging mirage "processes" with hundreds of thousands of "threads"?

Ian.

* Re: [Xen-users] Request for input: Extended event channel support
  2013-04-04 13:31 ` George Dunlap
@ 2013-04-10 10:49   ` Ian Campbell
  0 siblings, 0 replies; 19+ messages in thread
From: Ian Campbell @ 2013-04-10 10:49 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-users@lists.xen.org, xen-devel@lists.xen.org

On Thu, 2013-04-04 at 14:31 +0100, George Dunlap wrote:
> I think overall then I'm leaning towards recommending that we put the
> decision off until 4.4.

FWIW that's the way I'm leaning too. In the absence of lots of loud
clamouring it seems there is no need to rush, so the default should be
to defer.

Ian.

* Re: [Xen-users] Request for input: Extended event channel support
  2013-04-10 10:45       ` [Xen-users] " Ian Campbell
@ 2013-04-10 16:14         ` Anil Madhavapeddy
  0 siblings, 0 replies; 19+ messages in thread
From: Anil Madhavapeddy @ 2013-04-10 16:14 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-users@lists.xen.org, George Dunlap, Felipe Franciosi,
	cl-mirage@lists.cam.ac.uk List, xen-devel@lists.xen.org

On 10 Apr 2013, at 03:45, Ian Campbell <Ian.Campbell@citrix.com> wrote:

> On Thu, 2013-03-28 at 12:54 +0000, Anil Madhavapeddy wrote:
>> Yes, except that in Mirage's case we're single vCPU only, and use
>> multiple VMs to act as parallel processes with explicit message
>> passing.
>> 
>> But we would still need an event channel for the vchan shared ring, in
>> this case too... 
> 
> That would be between two mirage guests though, unless you are
> envisaging mirage "processes" with hundreds of thousands of "threads"?

That's correct: most of the channels should be directly between guests and
not to dom0.  It's convenient to be able to do this via dom0 for some
services such as xenstore/xenconsoled, but we could work around this
without too much difficulty.

George: my use case certainly isn't a blocker for 4.3.  We can maintain
local patches for this specialised use case.

-anil

end of thread, other threads:[~2013-04-10 16:14 UTC | newest]

Thread overview: 19+ messages
2013-03-27 11:23 Request for input: Extended event channel support George Dunlap
2013-03-27 19:36 ` Anil Madhavapeddy
2013-03-27 21:53   ` David Vrabel
2013-03-27 22:28     ` Anil Madhavapeddy
2013-03-27 22:31     ` Wei Liu
2013-03-28 12:51   ` Felipe Franciosi
2013-03-28 12:54     ` Anil Madhavapeddy
2013-03-28 13:02       ` Felipe Franciosi
2013-04-10 10:45       ` [Xen-users] " Ian Campbell
2013-04-10 16:14         ` Anil Madhavapeddy
2013-04-10 10:45     ` Ian Campbell
2013-03-28  1:56 ` Konrad Rzeszutek Wilk
2013-03-28 11:10   ` George Dunlap
2013-03-28 11:34     ` Jan Beulich
2013-03-29 13:05 ` Konrad Rzeszutek Wilk
2013-04-02  7:44   ` Jan Beulich
2013-04-02 14:20     ` Konrad Rzeszutek Wilk
2013-04-04 13:31 ` George Dunlap
2013-04-10 10:49   ` [Xen-users] " Ian Campbell
