All of lore.kernel.org
 help / color / mirror / Atom feed
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: David Vrabel <david.vrabel@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	Jan Beulich <jbeulich@suse.com>,
	xen-devel@lists.xen.org
Subject: Re: [PATCH 13/13] xen/events: use the FIFO-based ABI if available
Date: Tue, 24 Sep 2013 10:04:18 -0700 (PDT)	[thread overview]
Message-ID: <20130924170418.GA14139@phenom.dumpdata.com> (raw)
In-Reply-To: <5241C27A.2050900@cantab.net>

On Tue, Sep 24, 2013 at 05:48:58PM +0100, David Vrabel wrote:
> On 24/09/2013 16:50, Konrad Rzeszutek Wilk wrote:
> > On Fri, Sep 13, 2013 at 06:00:01PM +0100, David Vrabel wrote:
> >> +
> >> +  error:
> >> +	if (event_array_pages == 0)
> >> +		panic("xen: unable to expand event array with initial page (%d)\n", ret);
> > 
> > Shouldn't we just printk and return a negative value so we can use
> > the old style event channels?
> 
> No, once you've called EVTCHNOP_init_control for at least one VCPU the
> hypervisor has switched ABIs and there is (deliberatly[*]) no mechanism
> to switch back.
> 
> [*] it greatly simplifies the hypervisor ABI and implementation if we
> don't have to unwind a half initialized switch over.  This failure case
> will never happen in practise as the hypervisor will run out of domheap
> pages for the guest memory before it runs out of global mapping space.

OK. And that is nicely mentioned in the design doc which I failed to spot
until now.
> 
> >> +	else
> >> +		printk(KERN_ERR "xen: unable to expand event array (%d)\n", ret);
> >> +	free_unused_array_pages();
> >> +	return ret;
> >> +}
> >> +
> >> +static void fifo_bind_to_cpu(struct irq_info *info, int cpu)
> >> +{
> >> +	/* no-op */
> > 
> > Perhaps you can also say: "The event array is shared between all of the
> > guests VCPUs." (from design-E).
> 
> That's not relevant here.  This is a no-op because there is no guest
> side state necessary for tracking which VCPU an event channel is bound
> to -- this is all done by the per-VCPU queues.

Ah, gotcha.
> 
> >> +static void fifo_unmask(int port)
> >> +{
> >> +	unsigned int cpu = get_cpu();
> >> +	bool do_hypercall = false;
> >> +	bool evtchn_pending = false;
> >> +
> >> +	BUG_ON(!irqs_disabled());
> >> +
> >> +	if (unlikely((cpu != cpu_from_evtchn(port))))
> >> +		do_hypercall = true;
> > 
> > Since the event array is shared across all of the vCPUs is this still
> > needed?
> 
> The shared event array isn't relevant (the 2-level ABI also has shared
> pending and mask arrays).
> 
> But, since a hypercall is always required if an event is pending then we
> can omit the above check -- it's not possible (as in the 2-level case)
> to optimize out the hypercall if we're already on the VCPU.
> 
> >> +static uint32_t clear_linked(volatile event_word_t *word)
> >> +{
> >> +    event_word_t n, o, w;
> > 
> > How about just 'new', 'old', 'temp' ?
> 
> The letters match the design doc.

Since you have to modify the design document anyway would it make
sense to alter it as well to have less cryptic variables?

> 
> >> +
> >> +static void fifo_handle_events(int cpu)
> >> +{
> >> +	struct evtchn_fifo_control_block *control_block;
> >> +	uint32_t ready;
> >> +	int q;
> >> +
> >> +	control_block = per_cpu(cpu_control_block, cpu);
> >> +
> >> +	ready = xchg(&control_block->ready, 0);
> >> +
> >> +	while (ready & EVTCHN_FIFO_READY_MASK) {
> >> +		q = find_first_bit(BM(&ready), EVTCHN_FIFO_MAX_QUEUES);
> >> +		consume_one_event(cpu, control_block, q, &ready);
> >> +		ready |= xchg(&control_block->ready, 0);
> > 
> > Say priority 0 is where VIRQ_TIMER is set and the TIMER is set to
> > be incredibly short.
> > 
> > Couldn't this cause a scenario where the guest clears the ready in
> > consume_one_event (clear_bit(0, BM(ready)) and the hypervisor
> > at the same time can set the bit. We then process the IRQ (virq_timer).
> > 
> > Then we get to the xchg here and end up with the ready having the
> > priority 0 bit set again. 
> > 
> > And then loop again and again..
> 
> Don't do that then.  This is no different to the behaviour of hardware
> interrupts.

I was thinking there might be some mitigation techinque - and there is.

The Linux kernel would disable the IRQ, which hopefully means that this
defective port ends up being masked.


> 
> >> +		ret = HYPERVISOR_event_channel_op(EVTCHNOP_init_control,
> >> +						  &init_control);
> >> +		if (ret < 0)
> >> +			BUG();
> > 
> > So if this is run after migration to an older hypevisor won't we
> > blow up here? Shouldn't we just exit with an return and use the
> > old style classis events?
> 
> Migrating to older hypervisors has never been supported.  You can only
> go forwards.

OK.
> 
> >> +static int __cpuinit fifo_cpu_notification(struct notifier_block *self,
> >> +					   unsigned long action, void *hcpu)
> >> +{
> >> +	int cpu = (long)hcpu;
> >> +	int ret = 0;
> >> +
> >> +	switch (action) {
> >> +	case CPU_UP_PREPARE:
> >> +		if (!per_cpu(cpu_control_block, cpu))
> >> +			ret = fifo_init_control_block(cpu);
> >> +		break;
> >> +	default:
> >> +		break;
> >> +	}
> >> +	return ret < 0 ? NOTIFY_BAD : NOTIFY_OK;
> > 
> > I think you should return NOTIFY_OK.
> > 
> > Say you do this:
> > 
> > 1) Launch guest on a new hypervisor (which has FIFO ABI)
> > 2) Bring down all CPUs but one.
> > 3) Migrate to an older hypervisor (no FIFI ABI)
> > 4) Bring up all of the CPUs.
> > 
> > We shouldn't return NOTIFY_BAD b/c the hypervisor we are resuming
> > on is too old. We should just continue on without the FIFO mechanism
> > in place.
> 
> Again, migrating to an older hypervisor has never been supported.  Also,
> once we have switched to the FIFO-based ABI for one VCPU we cannot go
> back and it's better to fail to bring up the VCPU than have one that
> cannot receive events.

OK.
> 
> David

  reply	other threads:[~2013-09-24 17:04 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-09-13 16:59 (no subject) David Vrabel
2013-09-13 16:59 ` [PATCH 01/13] xen/events: refactor retrigger_dynirq() and resend_irq_on_evtchn() David Vrabel
2013-09-24 14:37   ` Konrad Rzeszutek Wilk
2013-09-24 16:04     ` David Vrabel
2013-09-13 16:59 ` [PATCH 02/13] xen/events: remove unnecessary init_evtchn_cpu_bindings() David Vrabel
2013-09-24 14:40   ` Konrad Rzeszutek Wilk
2013-09-24 16:06     ` David Vrabel
2013-09-24 16:49       ` Konrad Rzeszutek Wilk
2013-09-13 16:59 ` [PATCH 03/13] xen/events: introduce test_and_set_mask David Vrabel
2013-09-24 14:40   ` Konrad Rzeszutek Wilk
2013-09-13 16:59 ` [PATCH 04/13] xen/events: replace raw bit ops with functions David Vrabel
2013-09-24 14:41   ` Konrad Rzeszutek Wilk
2013-09-13 16:59 ` [PATCH 05/13] xen/events: move drivers/xen/events.c into drivers/xen/events/ David Vrabel
2013-09-13 16:59 ` [PATCH 06/13] xen/events: move 2-level specific code into its own file David Vrabel
2013-09-13 16:59 ` [PATCH 07/13] xen/events: add struct evtchn_ops for the low-level port operations David Vrabel
2013-09-24 14:44   ` Konrad Rzeszutek Wilk
2013-09-13 16:59 ` [PATCH 08/13] xen/events: allow setup of irq_info to fail David Vrabel
2013-09-24 14:47   ` Konrad Rzeszutek Wilk
2013-09-13 16:59 ` [PATCH 09/13] xen/events: add a evtchn_op for port setup David Vrabel
2013-09-24 14:47   ` Konrad Rzeszutek Wilk
2013-09-13 16:59 ` [PATCH 10/13] xen/events: Refactor evtchn_to_irq array to be dynamically allocated David Vrabel
2013-09-24 14:58   ` Konrad Rzeszutek Wilk
2013-09-26 10:53     ` David Vrabel
2013-09-26 12:51       ` Konrad Rzeszutek Wilk
2013-09-13 16:59 ` [PATCH 11/13] xen/events: add xen_evtchn_mask_all() David Vrabel
2013-09-24 14:58   ` Konrad Rzeszutek Wilk
2013-09-13 17:00 ` [PATCH 12/13] xen/events: Add the hypervisor interface for the FIFO-based event channels David Vrabel
2013-09-24 15:06   ` Konrad Rzeszutek Wilk
2013-09-26 11:06     ` David Vrabel
2013-09-24 15:08   ` Konrad Rzeszutek Wilk
2013-09-24 16:11     ` David Vrabel
2013-09-24 16:51       ` Konrad Rzeszutek Wilk
2013-09-13 17:00 ` [PATCH 13/13] xen/events: use the FIFO-based ABI if available David Vrabel
2013-09-24 15:50   ` Konrad Rzeszutek Wilk
2013-09-24 16:48     ` David Vrabel
2013-09-24 17:04       ` Konrad Rzeszutek Wilk [this message]
2013-09-13 17:03 ` [RFC PATCHv3 00/12] Linux: FIFO-based event channel ABI David Vrabel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130924170418.GA14139@phenom.dumpdata.com \
    --to=konrad.wilk@oracle.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=david.vrabel@citrix.com \
    --cc=jbeulich@suse.com \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.