From: Wei Liu <Wei.Liu2@citrix.com>
To: Ian Campbell <Ian.Campbell@citrix.com>
Cc: wei.liu2@citrix.com, Jan Beulich <JBeulich@suse.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [RFC] Extending numbers of event channels
Date: Mon, 3 Dec 2012 18:15:16 +0000
Message-ID: <1354558516.18784.31.camel@iceland>
In-Reply-To: <1354557455.2693.46.camel@zakaz.uk.xensource.com>

On Mon, 2012-12-03 at 17:57 +0000, Ian Campbell wrote:
> On Mon, 2012-12-03 at 17:52 +0000, Wei Liu wrote:
> > On Mon, 2012-12-03 at 17:35 +0000, Jan Beulich wrote:
> > > >>> On 03.12.12 at 17:29, Wei Liu <Wei.Liu2@citrix.com> wrote:
> > > > Regarding Jan's comment in [0], I don't think allowing the user to
> > > > specify an arbitrary number of levels is a good idea: only the last
> > > > level should be shared among vcpus, while the other levels should live
> > > > in percpu structs to allow for quicker lookup. Letting the user choose
> > > > the number of levels would be too complicated to implement and would
> > > > blow up the percpu section (since its size grows exponentially). Three
> > > > levels should be quite enough. See the maths below.
> > > 
> > > I didn't ask to implement more than three levels, I just asked for
> > > the interface to establish the number of levels a guest wants to
> > > use to allow for higher numbers (passing of which would result in
> > > -EINVAL in your implementation).
> > > 
> > 
> > Ah, I understand now. How about something like this:
> > 
> > struct EVTCHNOP_reg_nlevel {
> >     int levels;                       /* number of levels requested */
> >     void *level_specified_reg_struct; /* level-specific registration data */
> > };
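
To make the second field a bit more concrete, here is a rough sketch of
what the level-specific payload for levels == 3 might carry (all names
and fields here are hypothetical, nothing is a settled ABI): the guest
allocates the third-level bitmap pages itself and hands their frame
numbers to Xen.

#include <stdint.h>

/* Hypothetical registration payload for levels == 3.  The guest passes
 * the frames backing the third-level pending and mask bitmaps; 16 pages
 * per bitmap is the 64-bit worst case from the page counts below. */
struct evtchn_register_3level {
    uint32_t nr_pages;                 /* pages per bitmap actually used */
    uint64_t evtchn_pending_pfns[16];  /* guest frames, third-level pending */
    uint64_t evtchn_mask_pfns[16];     /* guest frames, third-level mask */
};
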
> > 
> > > > Number of event channels:
> > > >  * 32bit: 1024 * sizeof(unsigned long long) * BITS_PER_BYTE = 64k
> > > >  * 64bit: 4096 * sizeof(unsigned long long) * BITS_PER_BYTE = 512k
> > > > Basically the third level is a new ABI, so I choose to use unsigned long
> > > > long here to get more event channels.
> > > 
> > > Please don't: This would make things less consistent to handle
> > > at least in the guest side code. And I don't see why you would
> > > have a need to do so anyway (or else your argument above
> > > against further levels would become questionable).
> > > 
> > 
> > It was suggested by Ian to use unsigned long long. Ian, why do you
> > prefer unsigned long long to unsigned long?
> 
> I thought having 32 and 64 bit be the same might simplify some things,
> but if not then that's fine.
> 
> Is 32k event channels going to be enough in the long run? I suppose any
> system capable of running such a number of guests ought to be using 64
> bit == 512k which should at least last a bit longer.
> 

I think 32k is quite enough for 32-bit machines. And I agree that a
"system capable of running such a number of guests ought to be using 64
bit == 512k" ;-)

> > > > Pages occupied by the third level (if PAGE_SIZE=4k):
> > > >  * 32bit: 64k  / 8 / 4k = 2
> > > >  * 64bit: 512k / 8 / 4k = 16
> > > > 
> > > > Making the second level percpu will incur some overhead: in effect we
> > > > move the array currently in shared_info into the percpu struct:
> > > >  * 32bit: sizeof(unsigned long) * 8 * sizeof(unsigned long) = 128 bytes
> > > >  * 64bit: sizeof(unsigned long) * 8 * sizeof(unsigned long) = 512 bytes
> > > > 
> > > > What concerns me is that the struct evtchn buckets are currently
> > > > allocated all at once during the initialization phase. To save memory
> > > > inside Xen, the internal allocation/free scheme for evtchn needs to be
> > > > modified. Ian suggested we allocate a small number of buckets at start
> > > > of day and then dynamically fault in more as required.
> > > > 
> > > > To sum up:
> > > >      1. Guest should allocate pages for third level evtchn.
> > > >      2. Guest should register third level pages via a new hypercall op.
> > > 
> > > Doesn't the guest also need to set up space for the 2nd level?
> > > 
> > 
> > Yes. That will be embedded in the percpu struct vcpu_info, which will
> > also be registered via the same hypercall op.
> 
> NB that there is already a vcpu info placement hypercall. I have no
> problem with making that a prerequisite for this work.
> 

I saw that one, but that comes down to implementation details, so I
didn't go into it here.
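
To illustrate how the second level could simply ride along with the
existing vcpu info placement hypercall Ian mentions, here is a rough
sketch against the Linux guest wrappers (the "my_vcpu_info" layout and
the "evtchn_pending_l2" field name are my own placeholders, not a
settled ABI):

#include <linux/percpu.h>
#include <linux/mm.h>                /* offset_in_page() */
#include <xen/interface/xen.h>       /* struct vcpu_info */
#include <xen/interface/vcpu.h>      /* VCPUOP_register_vcpu_info */
#include <asm/xen/hypercall.h>       /* HYPERVISOR_vcpu_op() */
#include <asm/xen/page.h>            /* arbitrary_virt_to_mfn() */

/* Hypothetical per-vcpu layout: the existing vcpu_info followed by the
 * relocated second-level pending bits.  A real implementation would
 * make sure this area does not straddle a page boundary. */
struct my_vcpu_info {
    struct vcpu_info info;
    unsigned long evtchn_pending_l2[sizeof(unsigned long) * 8];
};

static DEFINE_PER_CPU(struct my_vcpu_info, xen_vcpu_info);

static int register_vcpu_info(int cpu)
{
    struct my_vcpu_info *vcpup = &per_cpu(xen_vcpu_info, cpu);
    struct vcpu_register_vcpu_info reg;

    reg.mfn    = arbitrary_virt_to_mfn(vcpup);
    reg.offset = offset_in_page(vcpup);

    /* Existing placement op; the second-level bits come along for free
     * because they live in the same registered area. */
    return HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, cpu, &reg);
}
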


Wei.

Thread overview: 20+ messages
2012-12-03 16:29 [RFC] Extending numbers of event channels Wei Liu
2012-12-03 17:35 ` Jan Beulich
2012-12-03 17:52   ` Wei Liu
2012-12-03 17:57     ` Ian Campbell
2012-12-03 18:15       ` Wei Liu [this message]
2012-12-03 18:00     ` Jan Beulich
2012-12-03 18:09       ` Wei Liu
2012-12-04  8:05         ` Jan Beulich
2012-12-04  9:30           ` Ian Campbell
2012-12-04  9:37             ` Jan Beulich
2012-12-03 17:43 ` Ian Campbell
2012-12-03 17:48   ` Jan Beulich
2012-12-03 17:50     ` Ian Campbell
2012-12-03 18:52 ` David Vrabel
2012-12-03 19:11   ` Wei Liu
2012-12-03 20:56   ` Wei Liu
2012-12-04 11:35     ` David Vrabel
2012-12-06 10:03       ` Tim Deegan
2012-12-04 11:29   ` George Dunlap
2012-12-04 13:45     ` Jan Beulich
