Re: [PATCH 16/16] xen/events: use the FIFO-based ABI if available

From: David Vrabel <david.vrabel@citrix.com>
To: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Jan Beulich <jbeulich@suse.com>, xen-devel@lists.xen.org
Subject: Re: [PATCH 16/16] xen/events: use the FIFO-based ABI if available
Date: Tue, 15 Oct 2013 19:58:52 +0100
Message-ID: <525D906C.5080401@citrix.com>
In-Reply-To: <525C4663.70907@oracle.com>

On 14/10/13 20:30, Boris Ostrovsky wrote:
> On 10/08/2013 08:49 AM, David Vrabel wrote:
>> From: David Vrabel <david.vrabel@citrix.com>
>>
>> Implement all the event channel port ops for the FIFO-based ABI.
>>
>> If the hypervisor supports the FIFO-based ABI, enable it by
>> initializing the control block for the boot VCPU and subsequent VCPUs
>> as they are brought up and on resume.  The event array is expanded as
>> required when event ports are setup.
[...]
>> --- a/drivers/xen/events/events.c
>> +++ b/drivers/xen/events/events.c
[...]
>> @@ -1636,7 +1637,13 @@ void xen_callback_vector(void) {}
>>     void __init xen_init_IRQ(void)
>>   {
>> -    xen_evtchn_2l_init();
>> +    int ret;
>> +
>> +    ret = xen_evtchn_fifo_init();
>> +    if (ret < 0) {
>> +        printk(KERN_INFO "xen: falling back to n-level event channels");
>> +        xen_evtchn_2l_init();
>> +    }
> 
> Should we provide users with ability to choose which mechanism to use?
> Is there any advantage in staying with 2-level? Stability, I guess,
> would be one.

If someone can demonstrate a use case where 2-level is better then we
could consider an option.  I don't think we want options for new
software features just because they might be buggy.
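
To illustrate the selection logic in the hunk above, here is a minimal user-space sketch of "try FIFO first, fall back to 2-level only on failure". Every name in it (fifo_init, two_level_init, init_event_channels, the hypervisor_has_fifo flag) is a hypothetical stand-in and not the kernel's code; the point is only that the choice is driven by what the hypervisor reports, with no user-visible knob.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins; none of these are the kernel's real functions. */
enum evtchn_abi { ABI_NONE, ABI_2L, ABI_FIFO };

static enum evtchn_abi active_abi = ABI_NONE;

/* Pretend probe: succeeds only if the "hypervisor" offers the FIFO ABI. */
static int fifo_init(bool hypervisor_has_fifo)
{
    if (!hypervisor_has_fifo)
        return -ENOSYS;
    active_abi = ABI_FIFO;
    return 0;
}

static void two_level_init(void)
{
    active_abi = ABI_2L;
}

/* Prefer FIFO; use 2-level only when FIFO setup fails. */
static enum evtchn_abi init_event_channels(bool hypervisor_has_fifo)
{
    int ret = fifo_init(hypervisor_has_fifo);

    if (ret < 0) {
        fprintf(stderr, "falling back to 2-level event channels (%d)\n", ret);
        two_level_init();
    }
    return active_abi;
}
```

Calling init_event_channels(true) selects the FIFO ABI in this sketch, and init_event_channels(false) ends up on 2-level.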

>> --- /dev/null
>> +++ b/drivers/xen/events/events_fifo.c
[...]
>> +#define BM(w) ((unsigned long *)(w))
> 
> This could go into a header file (events_internal.h?) since 2-level uses
> it as well.

Although they look the same, they're converting from different types:
xen_ulong_t in the 2-level case and event_word_t in the FIFO-based case.
So I would prefer this to be local to both files.

>> +
>> +    if (i >= MAX_EVENT_ARRAY_PAGES)
>> +        return -EINVAL;
>> +
>> +    while (i >= event_array_pages) {
>> +        void *array_page;
>> +        struct evtchn_expand_array expand_array;
>> +
>> +        /* Might already have a page if we've resumed. */
>> +        array_page = event_array[event_array_pages];
>> +        if (!array_page) {
>> +            array_page = (void *)__get_free_page(GFP_KERNEL);
>> +            if (array_page == NULL)
>> +                goto error;
>> +            event_array[event_array_pages] = array_page;
>> +        }
>> +
>> +        /* Mask all events in this page before adding it. */
>> +        init_array_page(array_page);
>> +
>> +        expand_array.array_gfn = virt_to_mfn(array_page);
>> +
>> +        ret = HYPERVISOR_event_channel_op(EVTCHNOP_expand_array, &expand_array);
>> +        if (ret < 0)
>> +            goto error;
>> +
>> +        event_array_pages++;
> 
> Should this increment happen in the 'if(!array_page)' clause?

No. event_array_pages is the number of pages Xen is aware of.  Note how
we zero it when resuming on a new domain with the FIFO-based ABI
initially disabled.
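
A minimal user-space sketch of that distinction, assuming a stubbed hypercall that always succeeds. The names (demo_array, demo_registered, demo_expand_to, demo_resume) are made up for illustration; the counter tracks registration with the hypervisor, while the pointer array tracks allocation.

```c
#include <errno.h>
#include <stdlib.h>

#define DEMO_MAX_PAGES 8   /* hypothetical limit, for illustration only */

static void *demo_array[DEMO_MAX_PAGES]; /* pages the guest has allocated */
static unsigned demo_registered;         /* pages the hypervisor knows about */
static unsigned demo_allocs;             /* fresh allocations, for the demo */

/* Stand-in for the expand-array hypercall; always succeeds here. */
static int demo_hypervisor_expand(void *page)
{
    (void)page;
    return 0;
}

/* Make sure the page with index i is registered with the "hypervisor". */
static int demo_expand_to(unsigned i)
{
    if (i >= DEMO_MAX_PAGES)
        return -EINVAL;

    while (i >= demo_registered) {
        void *page = demo_array[demo_registered];
        int ret;

        if (!page) {                  /* no page kept from before a resume */
            page = calloc(1, 4096);
            if (!page)
                return -ENOMEM;
            demo_allocs++;
            demo_array[demo_registered] = page;
        }

        ret = demo_hypervisor_expand(page);
        if (ret < 0)
            return ret;

        demo_registered++;            /* counts registrations, not allocations */
    }
    return 0;
}

/* Resume in a new domain: the pages stay, the registrations do not. */
static void demo_resume(void)
{
    demo_registered = 0;
}
```

After demo_resume(), a second demo_expand_to(2) registers the same three pages again without allocating any new ones, which is why the increment sits outside the allocation branch.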

>> +    }
>> +    return 0;
>> +
>> +  error:
>> +    if (event_array_pages == 0)
>> +        panic("xen: unable to expand event array with initial page (%d)\n", ret);
>> +    else
>> +        printk(KERN_ERR "xen: unable to expand event array (%d)\n", ret);
>> +    free_unused_array_pages();
> 
> Do you need to clean up in the hypervisor as well?

There's nothing to clean up in the hypervisor here.
free_unused_array_pages() is freeing array pages that Xen is not aware of.
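
As a rough sketch of that idea (the body of free_unused_array_pages() is not quoted in this mail, so the names and details below are assumptions): only pages at or beyond the registered count are released, because the hypervisor never saw them.

```c
#include <stdlib.h>

#define SK_MAX_PAGES 4   /* hypothetical limit */

static void *sk_pages[SK_MAX_PAGES];
static unsigned sk_registered;  /* pages [0, sk_registered) are known to the hypervisor */

/* Free only the pages the hypervisor is unaware of; return how many were freed. */
static unsigned sk_free_unused_pages(void)
{
    unsigned i, freed = 0;

    for (i = sk_registered; i < SK_MAX_PAGES; i++) {
        if (sk_pages[i]) {
            free(sk_pages[i]);
            sk_pages[i] = NULL;
            freed++;
        }
    }
    return freed;
}
```

Pages below sk_registered are left alone, so there is nothing to undo on the hypervisor side.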

>> +static void evtchn_fifo_mask(unsigned port)
>> +{
>> +    event_word_t *word = event_word_from_port(port);
>> +    if (word)
>> +        sync_set_bit(EVTCHN_FIFO_MASKED, BM(word));
> 
> You are testing 'word' here but not in the routines above (or below).

I think the test can be removed.  The common code used to try and mask
all events even if there were no array pages yet, but it doesn't do this
any more.

>> +}
>> +
>> +static void evtchn_fifo_unmask(unsigned port)
>> +{
>> +    event_word_t *word = event_word_from_port(port);
>> +
>> +    BUG_ON(!irqs_disabled());
>> +
>> +    sync_clear_bit(EVTCHN_FIFO_MASKED, BM(word));
>> +    if (sync_test_bit(EVTCHN_FIFO_PENDING, BM(word))) {
>> +        struct evtchn_unmask unmask = { .port = port };
>> +        (void)HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask);
>> +    }
>> +}
> 
> 2-level unmasking is somewhat more elaborate, with it trying to avoid
> races on pending events. Is this not a concern here?

The 2-level unmask is trying to avoid doing a hypercall as an
optimization.  This optimization is not possible with the FIFO-based
ABI, so the code here is much simpler.
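
The shape of the FIFO unmask can be sketched in user space with C11 atomics. The bit positions and the sk_ names are assumptions for illustration, and the hypercall is replaced by a counter; this is not the kernel implementation.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Assumed bit positions within an event word, for illustration only. */
#define SK_PENDING (UINT32_C(1) << 31)
#define SK_MASKED  (UINT32_C(1) << 30)

static unsigned sk_unmask_hypercalls;   /* counts stubbed hypercalls */

/* Stand-in for the unmask hypercall. */
static void sk_hypervisor_unmask(unsigned port)
{
    (void)port;
    sk_unmask_hypercalls++;
}

/* Clear the mask bit; if the event is pending, always ask the hypervisor. */
static void sk_unmask(_Atomic uint32_t *word, unsigned port)
{
    atomic_fetch_and(word, ~SK_MASKED);
    if (atomic_load(word) & SK_PENDING)
        sk_hypervisor_unmask(port);
}
```

There is no guest-side shortcut in this sketch: a pending event on unmask always results in the hypercall, and an event that is not pending results in none.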

>> +    if (head == 0) {
>> +        rmb(); /* Ensure word is up-to-date before reading head. */
>> +        head = control_block->head[priority];
>> +    }
>> +
>> +    port = head;
>> +    word = event_word_from_port(port);
> 
> Do you need to check for 'word!=NULL'? You don't check it in
> clear_linked() (which is maybe where this should be done).

I don't think so.  The kernel trusts Xen to only set valid LINK fields.
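
For context, here is a sketch of the kind of port-to-word lookup under discussion. event_word_from_port() itself is not quoted in this mail, so the layout below (32-bit words, 4096-byte pages, NULL for an unmapped page) is an assumption and the sk_ names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define SK_PAGE_SIZE      4096u
#define SK_WORDS_PER_PAGE (SK_PAGE_SIZE / sizeof(uint32_t))
#define SK_MAX_PAGES      4    /* hypothetical limit */

static uint32_t *sk_array[SK_MAX_PAGES];   /* one pointer per event array page */

/* Map a port to its event word, or NULL if that page is not present. */
static uint32_t *sk_word_from_port(unsigned port)
{
    size_t page = port / SK_WORDS_PER_PAGE;

    if (page >= SK_MAX_PAGES || !sk_array[page])
        return NULL;
    return sk_array[page] + port % SK_WORDS_PER_PAGE;
}
```

A caller that takes its port number from a LINK field written by Xen is relying on Xen never pointing at a port outside the registered pages, which is the trust relationship described above.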

>> +static void evtchn_fifo_resume(void)
>> +{
>> +    unsigned cpu;
>> +
>> +    for_each_possible_cpu(cpu) {
>> +        void *control_block = per_cpu(cpu_control_block, cpu);
>> +        struct evtchn_init_control init_control;
>> +        int ret;
>> +
>> +        if (!control_block)
>> +            continue;
>> +
>> +        /*
>> +         * If this CPU is offline, take the opportunity to
>> +         * free the control block while it is not being
>> +         * used.
>> +         */
>> +        if (!cpu_online(cpu)) {
>> +            free_page((unsigned long)control_block);
>> +            per_cpu(cpu_control_block, cpu) = NULL;
>> +            continue;
>> +        }
> 
> Have you tested offlining/onlining CPUs (lots of them)? I am asking
> because I see EVTCHNOP_init_control both here
> and in init_control_block() but I don't see anything that would "deinit"
> control block for which you are freeing the page above.

It's not possible to "deinit" a control block.  The hypervisor
deliberately doesn't provide an operation for this.

Note that evtchn_fifo_resume() is called when the guest is resumed in a
new domain which does not have any control blocks initialized yet. So,
in the case above, we're freeing a control block that Xen isn't aware of
yet.
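
A user-space sketch of the resume walk, with made-up names (sk_control_block, sk_cpu_online, sk_hypervisor_init_control) and a stubbed hypercall. It assumes, as the quoted hunk and the question suggest, that blocks for online CPUs are registered again with the new domain.

```c
#include <stdbool.h>
#include <stdlib.h>

#define SK_NR_CPUS 4   /* hypothetical CPU count */

static void *sk_control_block[SK_NR_CPUS];
static bool sk_cpu_online[SK_NR_CPUS];
static unsigned sk_init_calls;   /* counts stubbed init-control hypercalls */

/* Stand-in for registering a control block with the hypervisor. */
static int sk_hypervisor_init_control(unsigned cpu, void *block)
{
    (void)cpu;
    (void)block;
    sk_init_calls++;
    return 0;
}

/* On resume the new domain knows of no control blocks at all. */
static void sk_resume(void)
{
    for (unsigned cpu = 0; cpu < SK_NR_CPUS; cpu++) {
        void *block = sk_control_block[cpu];

        if (!block)
            continue;

        if (!sk_cpu_online[cpu]) {
            /* Not registered in the new domain yet, so freeing is safe. */
            free(block);
            sk_control_block[cpu] = NULL;
            continue;
        }

        /* Online CPU: hand the existing block to the new domain. */
        sk_hypervisor_init_control(cpu, block);
    }
}
```

The free happens before any registration in the new domain, so no "deinit" operation is needed in this model.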

>> +    int ret = 0;
>> +
>> +    switch (action) {
>> +    case CPU_UP_PREPARE:
>> +        if (!per_cpu(cpu_control_block, cpu))
>> +            ret = evtchn_fifo_init_control_block(cpu);
>> +        break;
>> +    default:
>> +        break;
>> +    }
> 
> What happens when you offline a CPU?

All the control blocks remain initialized, available for use when the
CPU is onlined again.  This is no different to the per-VCPU shared info.
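
The hotplug behaviour can be modelled like this. The hp_ names and the HP_DEAD action are hypothetical; the only thing taken from the patch is the "initialize on bring-up if there is no block yet, otherwise do nothing" logic.

```c
#include <stdlib.h>

#define HP_NR_CPUS 4   /* hypothetical CPU count */

enum hp_action { HP_UP_PREPARE, HP_DEAD };

static void *hp_block[HP_NR_CPUS];
static unsigned hp_inits;   /* how many control blocks were ever initialized */

/* Stand-in for allocating and registering a control block. */
static int hp_init_control_block(unsigned cpu)
{
    void *b = calloc(1, 64);

    if (!b)
        return -1;
    hp_block[cpu] = b;
    hp_inits++;
    return 0;
}

/* Only bring-up is handled; offlining leaves the block in place. */
static int hp_notify(enum hp_action action, unsigned cpu)
{
    int ret = 0;

    switch (action) {
    case HP_UP_PREPARE:
        if (!hp_block[cpu])
            ret = hp_init_control_block(cpu);
        break;
    default:
        break;
    }
    return ret;
}
```

Bringing a CPU up, taking it down, and bringing it up again initializes its block once and reuses it afterwards.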

This does all work fine[1].

David

[1] Once I fixed a recent bug I introduced into patch 10 which would
accidentally trash the IPIs/VIRQs for VCPU 0 instead of the offlined
VCPU.  Oops.
