Re: [RFC] Extending numbers of event channels

From: Wei Liu <Wei.Liu2@citrix.com>
To: Ian Campbell <Ian.Campbell@citrix.com>
Cc: wei.liu2@citrix.com, Jan Beulich <JBeulich@suse.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [RFC] Extending numbers of event channels
Date: Mon, 3 Dec 2012 18:15:16 +0000
Message-ID: <1354558516.18784.31.camel@iceland>
In-Reply-To: <1354557455.2693.46.camel@zakaz.uk.xensource.com>

On Mon, 2012-12-03 at 17:57 +0000, Ian Campbell wrote:
> On Mon, 2012-12-03 at 17:52 +0000, Wei Liu wrote:
> > On Mon, 2012-12-03 at 17:35 +0000, Jan Beulich wrote:
> > > >>> On 03.12.12 at 17:29, Wei Liu <Wei.Liu2@citrix.com> wrote:
> > > > Regarding Jan's comment in [0], I don't think allowing the user to specify
> > > > an arbitrary number of levels is a good idea. Only the last level
> > > > should be shared among vcpus; the other levels should be in the percpu
> > > > struct to allow for quicker lookup. Letting the user specify levels would
> > > > be too complicated to implement and would blow up the percpu section
> > > > (since the size grows exponentially). Three levels should be quite enough.
> > > > See maths below.
> > > 
> > > I didn't ask to implement more than three levels, I just asked for
> > > the interface to establish the number of levels a guest wants to
> > > use to allow for higher numbers (passing of which would result in
> > > -EINVAL in your implementation).
> > > 
> > 
> > Ah, I understand now. How about something like this:
> > 
> > struct EVTCHNOP_reg_nlevel {
> >     int levels;
> >     void *level_specified_reg_struct;
> > };
> > 
> > > > Number of event channels:
> > > >  * 32bit: 1024 * sizeof(unsigned long long) * BITS_PER_BYTE = 64k
> > > >  * 64bit: 4096 * sizeof(unsigned long long) * BITS_PER_BYTE = 512k
> > > > Basically the third level is a new ABI, so I choose to use unsigned long
> > > > long here to get more event channels.
> > > 
> > > Please don't: This would make things less consistent to handle
> > > at least in the guest side code. And I don't see why you would
> > > have a need to do so anyway (or else your argument above
> > > against further levels would become questionable).
> > > 
> > 
> > It was suggested by Ian to use unsigned long long. Ian, why do you
> > prefer unsigned long long to unsigned long?
> 
> I thought having 32 and 64 bit be the same might simplify some things,
> but if not then that's fine.
> 
> Is 32k event channels going to be enough in the long run? I suppose any
> system capable of running such a number of guests ought to be using 64
> bit == 512k which should at least last a bit longer.
> 

I think 32k is quite enough for 32bit machines. And I agree that any
"system capable of running such a number of guests ought to be using 64
bit == 512k" ;-)
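
To make the numbers in this thread easy to re-check, here is a minimal
sketch of the arithmetic, assuming every level is a bitmap built from
words of the same width (unsigned long on the guest). The helper names
are made up for illustration and are not taken from the Xen tree. Note
that the straight product for 64-bit words works out to 64 * 64 * 64 =
256k, so the 512k figure used in this thread may be worth double-checking.

```c
#include <assert.h>

/*
 * Hypothetical helper: number of event channels addressable by a
 * bitmap tree with `levels` levels of `word_bits`-wide words.
 */
static unsigned long evtchn_capacity(unsigned int word_bits,
                                     unsigned int levels)
{
    unsigned long n = 1;
    unsigned int i;

    /* Only 1 to 3 levels are considered in this discussion. */
    assert(levels >= 1 && levels <= 3);

    for (i = 0; i < levels; i++)
        n *= word_bits;
    return n;
}

/* Pages needed for the last-level bitmap, one bit per event channel. */
static unsigned long evtchn_bitmap_pages(unsigned long channels,
                                         unsigned long page_size)
{
    unsigned long bytes = (channels + 7) / 8;

    return (bytes + page_size - 1) / page_size;
}

/*
 * Size in bytes of the second-level array once it moves into the
 * percpu struct: word_bits words of word_bits / 8 bytes each.
 */
static unsigned long evtchn_l2_percpu_bytes(unsigned int word_bits)
{
    return (unsigned long)word_bits * (word_bits / 8);
}
```

With unsigned long words this gives 32k channels and 128 bytes of percpu
second level on 32bit, and 256k channels and 512 bytes on 64bit.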

> > > > Pages occupied by the third level (if PAGE_SIZE=4k):
> > > >  * 32bit: 64k  / 8 / 4k = 2
> > > >  * 64bit: 512k / 8 / 4k = 16
> > > > 
> > > > Making second level percpu will incur overhead. In fact we move the
> > > > array in shared info into percpu struct:
> > > >  * 32bit: sizeof(unsigned long) * 8 * sizeof(unsigned long) = 128 byte
> > > >  * 64bit: sizeof(unsigned long) * 8 * sizeof(unsigned long) = 512 byte
> > > > 
> > > > What concerns me is that the struct evtchn buckets are allocated all at
> > > > once during the initialization phase. To save memory inside Xen, the
> > > > internal allocation/free scheme for evtchn needs to be modified. Ian
> > > > suggested we allocate a small number of buckets at start of day, then
> > > > dynamically fault in more as required.
> > > > 
> > > > To sum up:
> > > >      1. Guest should allocate pages for third level evtchn.
> > > >      2. Guest should register third level pages via a new hypercall op.
> > > 
> > > Doesn't the guest also need to set up space for the 2nd level?
> > > 
> > 
> > Yes. That will be embedded in the percpu struct vcpu_info, which will
> > also be registered via the same hypercall op.
> 
> NB that there is already a vcpu info placement hypercall. I have no
> problem making this be a prerequisite for this work.
> 

I saw that one, but that's an implementation detail, so I didn't go
into it.
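
For the level-count interface discussed earlier in this mail, a rough
sketch of the argument check could look like the following. Everything
here is an assumption for illustration only: the struct layout, the
EVTCHN_MAX_LEVELS limit and the function name are hypothetical, not an
existing Xen interface. The point is just that the interface can carry
any level count while the implementation returns -EINVAL for counts it
does not support.

```c
#include <errno.h>
#include <stddef.h>

/* Assumed upper bound: this proposal only implements three levels. */
#define EVTCHN_MAX_LEVELS 3

/* Hypothetical argument for a "register N-level event channel" op. */
struct evtchn_reg_nlevel {
    unsigned int levels;        /* number of levels the guest wants */
    void *level_specific_reg;   /* per-level registration data */
};

/*
 * Validate the request: the interface accepts any level count, but
 * anything this implementation cannot handle is rejected.
 */
static int evtchn_reg_nlevel_check(const struct evtchn_reg_nlevel *reg)
{
    if (reg == NULL || reg->level_specific_reg == NULL)
        return -EINVAL;

    if (reg->levels == 0 || reg->levels > EVTCHN_MAX_LEVELS)
        return -EINVAL;

    return 0;
}
```

A guest asking for four levels would get -EINVAL back and could fall
back to three, without the ABI having to change later.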


Wei.
