* [PATCH 00/50] ia64/xen take 3: ia64/xen domU paravirtualization
@ 2008-03-05 18:18 Isaku Yamahata
2008-03-05 18:19 ` [PATCH 50/50] ia64/pv_ops/xen: define xen pv_irq_ops Isaku Yamahata
0 siblings, 1 reply; 7+ messages in thread
From: Isaku Yamahata @ 2008-03-05 18:18 UTC (permalink / raw)
To: linux-ia64
Hi. This patchset implements xen/ia64 domU support.
Qing He and Eddie Dong have also been working on pv_ops, so
I want to discuss before going further and avoid duplicated work.
I suppose that Eddie will also post his own patch. By reviewing both
patches, we can arrive at a better pv_ops interface.
- I didn't change the ia64 intrinsic paravirtualization ABI from
the last post. Presumably it would be better to discuss it alongside
Eddie's patch.
- I implemented the basic portion of domU pv_ops.
The interfaces may need refinement. Probably Eddie has
his own opinion.
- This time I dropped the patches which haven't been pv_ops'ified yet
because they are unchanged from the last post.
You can also get the full source from
http://people.valinux.co.jp/~yamahata/xen-ia64/linux-2.6-xen-ia64.git/
branch: xen-ia64-2008mar06
The patchset is organized as follows:
- xen arch portability.
Generalizes x86 xen patches for ia64 support.
- some preliminary patches.
Make kernel paravirtualization friendly.
- introduce pv_ops and the definitions for native.
- basic helper functions for xen ia64 support.
- introduce the pv_ops instance for xen/ia64.
TODO:
- discuss and define intrinsic paravirtualization abi.
- discuss pv_ops.
- more pv_ops for domU
- mca/sal call
- timer
- gate page
- fsys
- support save/restore/live migration
- more clean ups
- remove unnecessary if (is_running_on_xen()) checks.
- free the xen_ivt areas somehow, so as not to waste kernel space.
(From an idea by Keith Owens.)
Probably after defining the ABI, because this is just an optimization.
- dom0
Consider after finishing the domU/ia64 merge.
Changes from take 2:
- many clean ups following review comments.
- clean up: assembly instruction macros.
- introduced pv_ops: pv_info, pv_init_ops, pv_iosapic_ops, pv_irq_ops.
Changes from take 1:
Single IVT source code, compiled multiple times using assembler macros.
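For readers new to the pv_ops pattern being discussed, here is a minimal, hypothetical C sketch of the indirection: a table of function pointers that defaults to the native implementation and is switched to the Xen instance at early boot. All names here (example_pv_irq_ops, example_assign, the vector bases) are invented for illustration; the real ia64 definitions live in the patches below.

```c
#include <assert.h>

/* Hypothetical, simplified pv_ops-style table. A struct of function
 * pointers; callers always go through the table, never through a
 * concrete implementation directly. */
struct example_pv_irq_ops {
	int (*assign_irq_vector)(int irq);
};

static int native_assign_irq_vector(int irq)
{
	/* native: pretend device vectors start at 0x30 */
	return 0x30 + irq;
}

static int xen_assign_irq_vector(int irq)
{
	/* xen: pretend the hypervisor hands out vectors from 0x100 */
	return 0x100 + irq;
}

/* Boot-time default is native; the xen setup path overwrites it. */
static struct example_pv_irq_ops example_pv_irq_ops = {
	.assign_irq_vector = native_assign_irq_vector,
};

/* Analogous to xen_setup_pv_ops() in the patches: swap in the xen
 * instance once we know we are running on the hypervisor. */
static void example_xen_setup_pv_ops(void)
{
	example_pv_irq_ops.assign_irq_vector = xen_assign_irq_vector;
}

int example_assign(int irq)
{
	return example_pv_irq_ops.assign_irq_vector(irq);
}
```

The point of the pattern is that generic code compiles once against the table, so the native kernel pays only an indirect call (which can later be patched away) while a xen kernel redirects the same call sites to hypercalls.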
thanks,
Diffstat:
arch/ia64/Kconfig | 72 +++
arch/ia64/kernel/Makefile | 30 +-
arch/ia64/kernel/acpi.c | 4 +
arch/ia64/kernel/asm-offsets.c | 25 +
arch/ia64/kernel/entry.S | 568 +------------------
arch/ia64/kernel/head.S | 6 +
arch/ia64/kernel/inst_native.h | 183 ++++++
arch/ia64/kernel/inst_paravirt.h | 28 +
arch/ia64/kernel/iosapic.c | 43 +-
arch/ia64/kernel/irq_ia64.c | 21 +-
arch/ia64/kernel/ivt.S | 153 +++---
arch/ia64/kernel/minstate.h | 10 +-
arch/ia64/kernel/module.c | 32 +
arch/ia64/kernel/pal.S | 5 +-
arch/ia64/kernel/paravirt.c | 94 +++
arch/ia64/kernel/paravirt_alt.c | 118 ++++
arch/ia64/kernel/paravirt_core.c | 201 +++++++
arch/ia64/kernel/paravirt_entry.c | 99 ++++
arch/ia64/kernel/paravirt_nop.c | 49 ++
arch/ia64/kernel/paravirtentry.S | 37 ++
arch/ia64/kernel/setup.c | 14 +
arch/ia64/kernel/smpboot.c | 2 +
arch/ia64/kernel/switch_leave.S | 603 +++++++++++++++++++
arch/ia64/kernel/vmlinux.lds.S | 35 ++
arch/ia64/xen/Makefile | 9 +
arch/ia64/xen/hypercall.S | 141 +++++
arch/ia64/xen/hypervisor.c | 235 ++++++++
arch/ia64/xen/inst_xen.h | 503 ++++++++++++++++
arch/ia64/xen/irq_xen.c | 435 ++++++++++++++
arch/ia64/xen/irq_xen.h | 8 +
arch/ia64/xen/machvec.c | 4 +
arch/ia64/xen/paravirt_xen.c | 242 ++++++++
arch/ia64/xen/privops_asm.S | 221 +++++++
arch/ia64/xen/privops_c.c | 279 +++++++++
arch/ia64/xen/util.c | 101 ++++
arch/ia64/xen/xcom_asm.S | 27 +
arch/ia64/xen/xcom_hcall.c | 458 +++++++++++++++
arch/ia64/xen/xen_pv_ops.c | 319 ++++++++++
arch/ia64/xen/xencomm.c | 108 ++++
arch/ia64/xen/xenivt.S | 59 ++
arch/ia64/{kernel/minstate.h => xen/xenminstate.h} | 96 +---
arch/ia64/xen/xenpal.S | 76 +++
arch/ia64/xen/xensetup.S | 60 ++
arch/x86/xen/Makefile | 4 +-
arch/x86/xen/grant-table.c | 91 +++
arch/x86/xen/xen-ops.h | 2 +-
drivers/xen/Makefile | 3 +-
{arch/x86 => drivers}/xen/events.c | 33 +-
{arch/x86 => drivers}/xen/features.c | 0
drivers/xen/grant-table.c | 37 +--
drivers/xen/xenbus/xenbus_client.c | 6 +-
drivers/xen/xencomm.c | 232 ++++++++
include/asm-ia64/gcc_intrin.h | 58 +-
include/asm-ia64/hw_irq.h | 24 +-
include/asm-ia64/intel_intrin.h | 64 +-
include/asm-ia64/intrinsics.h | 12 +
include/asm-ia64/iosapic.h | 18 +-
include/asm-ia64/irq.h | 33 ++
include/asm-ia64/machvec.h | 2 +
include/asm-ia64/machvec_xen.h | 22 +
include/asm-ia64/meminit.h | 3 +-
include/asm-ia64/mmu_context.h | 6 +-
include/asm-ia64/module.h | 6 +
include/asm-ia64/page.h | 8 +
include/asm-ia64/paravirt.h | 284 +++++++++
include/asm-ia64/paravirt_alt.h | 82 +++
include/asm-ia64/paravirt_core.h | 54 ++
include/asm-ia64/paravirt_entry.h | 62 ++
include/asm-ia64/paravirt_nop.h | 46 ++
include/asm-ia64/privop.h | 67 +++
include/asm-ia64/privop_paravirt.h | 587 +++++++++++++++++++
include/asm-ia64/sync_bitops.h | 59 ++
include/asm-ia64/system.h | 4 +-
include/asm-ia64/xen/hypercall.h | 426 ++++++++++++++
include/asm-ia64/xen/hypervisor.h | 249 ++++++++
include/asm-ia64/xen/interface.h | 585 +++++++++++++++++++
include/asm-ia64/xen/page.h | 41 ++
include/asm-ia64/xen/privop.h | 609 ++++++++++++++++++++
include/asm-ia64/xen/xcom_hcall.h | 55 ++
include/asm-ia64/xen/xencomm.h | 33 ++
include/asm-x86/xen/hypervisor.h | 10 +
include/asm-x86/xen/interface.h | 24 +
include/{ => asm-x86}/xen/page.h | 0
include/xen/events.h | 1 +
include/xen/grant_table.h | 6 +
include/xen/interface/callback.h | 119 ++++
include/xen/interface/grant_table.h | 11 +-
include/xen/interface/vcpu.h | 5 +
include/xen/interface/xen.h | 22 +-
include/xen/interface/xencomm.h | 41 ++
include/xen/page.h | 181 +------
include/xen/xen-ops.h | 6 +
include/xen/xencomm.h | 77 +++
93 files changed, 9174 insertions(+), 1049 deletions(-)
^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 50/50] ia64/pv_ops/xen: define xen pv_irq_ops.
@ 2008-03-05 18:19 ` Isaku Yamahata
  2008-03-20  9:13   ` Xen common code across architecture Dong, Eddie
  0 siblings, 1 reply; 7+ messages in thread
From: Isaku Yamahata @ 2008-03-05 18:19 UTC (permalink / raw)
To: linux-ia64

Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
---
 arch/ia64/xen/Makefile     |    2 +-
 arch/ia64/xen/hypercall.S  |   10 +
 arch/ia64/xen/irq_xen.c    |  435 ++++++++++++++++++++++++++++++++++++++++++++
 arch/ia64/xen/irq_xen.h    |    8 +
 arch/ia64/xen/xen_pv_ops.c |    3 +
 include/asm-ia64/hw_irq.h  |    4 +
 include/asm-ia64/irq.h     |   33 ++++
 7 files changed, 494 insertions(+), 1 deletions(-)
 create mode 100644 arch/ia64/xen/irq_xen.c
 create mode 100644 arch/ia64/xen/irq_xen.h

diff --git a/arch/ia64/xen/Makefile b/arch/ia64/xen/Makefile
index 4b1db56..ff7a58d 100644
--- a/arch/ia64/xen/Makefile
+++ b/arch/ia64/xen/Makefile
@@ -2,7 +2,7 @@
 # Makefile for Xen components
 #
 
-obj-y := xen_pv_ops.o
+obj-y := xen_pv_ops.o irq_xen.o
 
 obj-$(CONFIG_PARAVIRT_ALT) += paravirt_xen.o privops_asm.o privops_c.o
 obj-$(CONFIG_PARAVIRT_NOP_B_PATCH) += paravirt_xen.o
diff --git a/arch/ia64/xen/hypercall.S b/arch/ia64/xen/hypercall.S
index 7c5242b..3fad2fe 100644
--- a/arch/ia64/xen/hypercall.S
+++ b/arch/ia64/xen/hypercall.S
@@ -123,6 +123,16 @@ END(xen_set_eflag)
 #endif /* CONFIG_IA32_SUPPORT */
 #endif /* ASM_SUPPORTED */
 
+GLOBAL_ENTRY(xen_send_ipi)
+	mov r14=r32
+	mov r15=r33
+	mov r2=0x400
+	break 0x1000
+	;;
+	br.ret.sptk.many rp
+	;;
+END(xen_send_ipi)
+
 GLOBAL_ENTRY(__hypercall)
 	mov r2=r37
 	break 0x1000
diff --git a/arch/ia64/xen/irq_xen.c b/arch/ia64/xen/irq_xen.c
new file mode 100644
index 0000000..57fab2b
--- /dev/null
+++ b/arch/ia64/xen/irq_xen.c
@@ -0,0 +1,435 @@
+/******************************************************************************
+ * arch/ia64/xen/irq_xen.c
+ *
+ * Copyright (c) 2006 Isaku Yamahata <yamahata at valinux co jp>
+ *                    VA Linux Systems Japan K.K.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ */
+
+#include <linux/cpu.h>
+
+#include <xen/events.h>
+#include <xen/interface/callback.h>
+
+#include "irq_xen.h"
+
+/***************************************************************************
+ * pv_irq_ops
+ * irq operations
+ */
+
+static int
+xen_assign_irq_vector(int irq)
+{
+	struct physdev_irq irq_op;
+
+	irq_op.irq = irq;
+	if (HYPERVISOR_physdev_op(PHYSDEVOP_alloc_irq_vector, &irq_op))
+		return -ENOSPC;
+
+	return irq_op.vector;
+}
+
+static void
+xen_free_irq_vector(int vector)
+{
+	struct physdev_irq irq_op;
+
+	if (vector < IA64_FIRST_DEVICE_VECTOR ||
+	    vector > IA64_LAST_DEVICE_VECTOR)
+		return;
+
+	irq_op.vector = vector;
+	if (HYPERVISOR_physdev_op(PHYSDEVOP_free_irq_vector, &irq_op))
+		printk(KERN_WARNING "%s: xen_free_irq_vector fail vector=%d\n",
+		       __func__, vector);
+}
+
+
+static DEFINE_PER_CPU(int, timer_irq) = -1;
+static DEFINE_PER_CPU(int, ipi_irq) = -1;
+static DEFINE_PER_CPU(int, resched_irq) = -1;
+static DEFINE_PER_CPU(int, cmc_irq) = -1;
+static DEFINE_PER_CPU(int, cmcp_irq) = -1;
+static DEFINE_PER_CPU(int, cpep_irq) = -1;
+#define NAME_SIZE	15
+static DEFINE_PER_CPU(char[NAME_SIZE], timer_name);
+static DEFINE_PER_CPU(char[NAME_SIZE], ipi_name);
+static DEFINE_PER_CPU(char[NAME_SIZE], resched_name);
+static DEFINE_PER_CPU(char[NAME_SIZE], cmc_name);
+static DEFINE_PER_CPU(char[NAME_SIZE], cmcp_name);
+static DEFINE_PER_CPU(char[NAME_SIZE], cpep_name);
+#undef NAME_SIZE
+
+struct saved_irq {
+	unsigned int irq;
+	struct irqaction *action;
+};
+/* 16 should be a far optimistic value, since only several percpu irqs
+ * are registered early.
+ */
+#define MAX_LATE_IRQ	16
+static struct saved_irq saved_percpu_irqs[MAX_LATE_IRQ];
+static unsigned short late_irq_cnt = 0;
+static unsigned short saved_irq_cnt = 0;
+static int xen_slab_ready = 0;
+
+#ifdef CONFIG_SMP
+/* Dummy stub. Though we may check RESCHEDULE_VECTOR before __do_IRQ,
+ * it ends up issuing several memory accesses on percpu data and
+ * thus adds unnecessary traffic to other paths.
+ */
+static irqreturn_t
+xen_dummy_handler(int irq, void *dev_id)
+{
+	return IRQ_HANDLED;
+}
+
+static struct irqaction xen_resched_irqaction = {
+	.handler =	xen_dummy_handler,
+	.flags =	IRQF_DISABLED,
+	.name =		"resched"
+};
+
+static struct irqaction xen_tlb_irqaction = {
+	.handler =	xen_dummy_handler,
+	.flags =	IRQF_DISABLED,
+	.name =		"tlb_flush"
+};
+#endif
+
+/*
+ * This is the xen version of percpu irq registration, which needs to bind
+ * to the xen-specific evtchn sub-system. One trick here is that the xen
+ * evtchn binding interface depends on kmalloc because the related
+ * port needs to be freed at device/cpu down. So we cache the
+ * registrations on the BSP before slab is ready and then deal with them
+ * at a later point. For the rest of the instances, happening after slab is
+ * ready, we hook them to the xen evtchn immediately.
+ *
+ * FIXME: MCA is not supported so far, and thus the "nomca" boot param is
+ * required.
+ */
+static void
+__xen_register_percpu_irq(unsigned int cpu, unsigned int vec,
+			  struct irqaction *action, int save)
+{
+	irq_desc_t *desc;
+	int irq = 0;
+
+	if (xen_slab_ready) {
+		switch (vec) {
+		case IA64_TIMER_VECTOR:
+			snprintf(per_cpu(timer_name, cpu),
+				 sizeof(per_cpu(timer_name, cpu)),
+				 "%s%d", action->name, cpu);
+			irq = bind_virq_to_irqhandler(VIRQ_ITC, cpu,
+				action->handler, action->flags,
+				per_cpu(timer_name, cpu), action->dev_id);
+			per_cpu(timer_irq, cpu) = irq;
+			break;
+		case IA64_IPI_RESCHEDULE:
+			snprintf(per_cpu(resched_name, cpu),
+				 sizeof(per_cpu(resched_name, cpu)),
+				 "%s%d", action->name, cpu);
+			irq = bind_ipi_to_irqhandler(RESCHEDULE_VECTOR, cpu,
+				action->handler, action->flags,
+				per_cpu(resched_name, cpu), action->dev_id);
+			per_cpu(resched_irq, cpu) = irq;
+			break;
+		case IA64_IPI_VECTOR:
+			snprintf(per_cpu(ipi_name, cpu),
+				 sizeof(per_cpu(ipi_name, cpu)),
+				 "%s%d", action->name, cpu);
+			irq = bind_ipi_to_irqhandler(IPI_VECTOR, cpu,
+				action->handler, action->flags,
+				per_cpu(ipi_name, cpu), action->dev_id);
+			per_cpu(ipi_irq, cpu) = irq;
+			break;
+		case IA64_CMC_VECTOR:
+			snprintf(per_cpu(cmc_name, cpu),
+				 sizeof(per_cpu(cmc_name, cpu)),
+				 "%s%d", action->name, cpu);
+			irq = bind_virq_to_irqhandler(VIRQ_MCA_CMC, cpu,
+						      action->handler,
+						      action->flags,
+						      per_cpu(cmc_name, cpu),
+						      action->dev_id);
+			per_cpu(cmc_irq, cpu) = irq;
+			break;
+		case IA64_CMCP_VECTOR:
+			snprintf(per_cpu(cmcp_name, cpu),
+				 sizeof(per_cpu(cmcp_name, cpu)),
+				 "%s%d", action->name, cpu);
+			irq = bind_ipi_to_irqhandler(CMCP_VECTOR, cpu,
+						     action->handler,
+						     action->flags,
+						     per_cpu(cmcp_name, cpu),
+						     action->dev_id);
+			per_cpu(cmcp_irq, cpu) = irq;
+			break;
+		case IA64_CPEP_VECTOR:
+			snprintf(per_cpu(cpep_name, cpu),
+				 sizeof(per_cpu(cpep_name, cpu)),
+				 "%s%d", action->name, cpu);
+			irq = bind_ipi_to_irqhandler(CPEP_VECTOR, cpu,
+						     action->handler,
+						     action->flags,
+						     per_cpu(cpep_name, cpu),
+						     action->dev_id);
+			per_cpu(cpep_irq, cpu) = irq;
+			break;
+		case IA64_CPE_VECTOR:
+		case IA64_MCA_RENDEZ_VECTOR:
+		case IA64_PERFMON_VECTOR:
+		case IA64_MCA_WAKEUP_VECTOR:
+		case IA64_SPURIOUS_INT_VECTOR:
+			/* No need to complain, these aren't supported. */
+			break;
+		default:
+			printk(KERN_WARNING "Percpu irq %d is unsupported "
+			       "by xen!\n", vec);
+			break;
+		}
+		BUG_ON(irq < 0);
+
+		if (irq > 0) {
+			/*
+			 * Mark percpu. Without this, migrate_irqs() will
+			 * mark the interrupt for migrations and trigger it
+			 * on cpu hotplug.
+			 */
+			desc = irq_desc + irq;
+			desc->status |= IRQ_PER_CPU;
+		}
+	}
+
+	/* For BSP, we cache registered percpu irqs, and then re-walk
+	 * them when initializing APs
+	 */
+	if (!cpu && save) {
+		BUG_ON(saved_irq_cnt == MAX_LATE_IRQ);
+		saved_percpu_irqs[saved_irq_cnt].irq = vec;
+		saved_percpu_irqs[saved_irq_cnt].action = action;
+		saved_irq_cnt++;
+		if (!xen_slab_ready)
+			late_irq_cnt++;
+	}
+}
+
+static void
+xen_register_percpu_irq(ia64_vector vec, struct irqaction *action)
+{
+	__xen_register_percpu_irq(smp_processor_id(), vec, action, 1);
+}
+
+static void
+xen_bind_early_percpu_irq(void)
+{
+	int i;
+
+	xen_slab_ready = 1;
+	/* There's no race when accessing this cached array, since only
+	 * the BSP will face such a step shortly
+	 */
+	for (i = 0; i < late_irq_cnt; i++)
+		__xen_register_percpu_irq(smp_processor_id(),
+					  saved_percpu_irqs[i].irq,
+					  saved_percpu_irqs[i].action, 0);
+}
+
+/* FIXME: There's no obvious point to check whether slab is ready. So
+ * a hack is used here by utilizing a late time hook.
+ */
+extern void (*late_time_init)(void);
+extern char xen_event_callback;
+extern void xen_init_IRQ(void);
+
+#ifdef CONFIG_HOTPLUG_CPU
+static int __devinit
+unbind_evtchn_callback(struct notifier_block *nfb,
+		       unsigned long action, void *hcpu)
+{
+	unsigned int cpu = (unsigned long)hcpu;
+
+	if (action == CPU_DEAD) {
+		/* Unregister evtchn. */
+		if (per_cpu(cpep_irq, cpu) >= 0) {
+			unbind_from_irqhandler(per_cpu(cpep_irq, cpu), NULL);
+			per_cpu(cpep_irq, cpu) = -1;
+		}
+		if (per_cpu(cmcp_irq, cpu) >= 0) {
+			unbind_from_irqhandler(per_cpu(cmcp_irq, cpu), NULL);
+			per_cpu(cmcp_irq, cpu) = -1;
+		}
+		if (per_cpu(cmc_irq, cpu) >= 0) {
+			unbind_from_irqhandler(per_cpu(cmc_irq, cpu), NULL);
+			per_cpu(cmc_irq, cpu) = -1;
+		}
+		if (per_cpu(ipi_irq, cpu) >= 0) {
+			unbind_from_irqhandler(per_cpu(ipi_irq, cpu), NULL);
+			per_cpu(ipi_irq, cpu) = -1;
+		}
+		if (per_cpu(resched_irq, cpu) >= 0) {
+			unbind_from_irqhandler(per_cpu(resched_irq, cpu),
+					       NULL);
+			per_cpu(resched_irq, cpu) = -1;
+		}
+		if (per_cpu(timer_irq, cpu) >= 0) {
+			unbind_from_irqhandler(per_cpu(timer_irq, cpu), NULL);
+			per_cpu(timer_irq, cpu) = -1;
+		}
+	}
+	return NOTIFY_OK;
+}
+
+static struct notifier_block unbind_evtchn_notifier = {
+	.notifier_call = unbind_evtchn_callback,
+	.priority = 0
+};
+#endif
+
+DECLARE_PER_CPU(int, ipi_to_irq[NR_IPIS]);
+void xen_smp_intr_init_early(unsigned int cpu)
+{
+#ifdef CONFIG_SMP
+	unsigned int i;
+
+	for (i = 0; i < saved_irq_cnt; i++)
+		__xen_register_percpu_irq(cpu, saved_percpu_irqs[i].irq,
+					  saved_percpu_irqs[i].action, 0);
+#endif
+}
+
+void xen_smp_intr_init(void)
+{
+#ifdef CONFIG_SMP
+	unsigned int cpu = smp_processor_id();
+	struct callback_register event = {
+		.type = CALLBACKTYPE_event,
+		.address = (unsigned long)&xen_event_callback,
+	};
+
+	if (cpu == 0) {
+		/* Initialization was already done for boot cpu. */
+#ifdef CONFIG_HOTPLUG_CPU
+		/* Register the notifier only once. */
+		register_cpu_notifier(&unbind_evtchn_notifier);
+#endif
+		return;
+	}
+
+	/* This should be piggybacked when setting up the vcpu guest context */
+	BUG_ON(HYPERVISOR_callback_op(CALLBACKOP_register, &event));
+#endif /* CONFIG_SMP */
+}
+
+void __init
+xen_irq_init(void)
+{
+	struct callback_register event = {
+		.type = CALLBACKTYPE_event,
+		.address = (unsigned long)&xen_event_callback,
+	};
+
+	xen_init_IRQ();
+	BUG_ON(HYPERVISOR_callback_op(CALLBACKOP_register, &event));
+	late_time_init = xen_bind_early_percpu_irq;
+}
+
+void
+xen_platform_send_ipi(int cpu, int vector, int delivery_mode, int redirect)
+{
+	int irq = -1;
+
+#ifdef CONFIG_SMP
+	/* TODO: we need to call vcpu_up here */
+	if (unlikely(vector == ap_wakeup_vector)) {
+		/* XXX
+		 * This should be in __cpu_up(cpu) in ia64 smpboot.c
+		 * like x86. But don't want to modify it,
+		 * keep it untouched.
+		 */
+		xen_smp_intr_init_early(cpu);
+
+		xen_send_ipi(cpu, vector);
+		/* vcpu_prepare_and_up(cpu); */
+		return;
+	}
+#endif
+
+	switch (vector) {
+	case IA64_IPI_VECTOR:
+		irq = per_cpu(ipi_to_irq, cpu)[IPI_VECTOR];
+		break;
+	case IA64_IPI_RESCHEDULE:
+		irq = per_cpu(ipi_to_irq, cpu)[RESCHEDULE_VECTOR];
+		break;
+	case IA64_CMCP_VECTOR:
+		irq = per_cpu(ipi_to_irq, cpu)[CMCP_VECTOR];
+		break;
+	case IA64_CPEP_VECTOR:
+		irq = per_cpu(ipi_to_irq, cpu)[CPEP_VECTOR];
+		break;
+	default:
+		printk(KERN_WARNING "Unsupported IPI type 0x%x\n",
+		       vector);
+		irq = 0;
+		break;
+	}
+
+	BUG_ON(irq < 0);
+	notify_remote_via_irq(irq);
+	return;
+}
+
+static void __init
+xen_init_IRQ_early(void)
+{
+#ifdef CONFIG_SMP
+	register_percpu_irq(IA64_IPI_RESCHEDULE, &xen_resched_irqaction);
+	register_percpu_irq(IA64_IPI_LOCAL_TLB_FLUSH, &xen_tlb_irqaction);
+#endif
+}
+
+static void __init
+xen_init_IRQ_late(void)
+{
+#ifdef CONFIG_XEN_PRIVILEGED_GUEST
+	if (is_running_on_xen() && !ia64_platform_is("xen"))
+		xen_irq_init();
+#endif
+}
+
+static void
+xen_resend_irq(unsigned int vector)
+{
+	(void)resend_irq_on_evtchn(vector);
+}
+
+const struct pv_irq_ops xen_irq_ops __initdata = {
+	.init_IRQ_early = xen_init_IRQ_early,
+	.init_IRQ_late = xen_init_IRQ_late,
+
+	.assign_irq_vector = xen_assign_irq_vector,
+	.free_irq_vector = xen_free_irq_vector,
+	.register_percpu_irq = xen_register_percpu_irq,
+
+	.send_ipi = xen_platform_send_ipi,
+	.resend_irq = xen_resend_irq,
+};
diff --git a/arch/ia64/xen/irq_xen.h b/arch/ia64/xen/irq_xen.h
new file mode 100644
index 0000000..a2c3ed9
--- /dev/null
+++ b/arch/ia64/xen/irq_xen.h
@@ -0,0 +1,8 @@
+#ifndef IRQ_XEN_H
+#define IRQ_XEN_H
+
+extern const struct pv_irq_ops xen_irq_ops __initdata;
+extern void xen_smp_intr_init(void);
+extern void xen_send_ipi(int cpu, int vec);
+
+#endif /* IRQ_XEN_H */
diff --git a/arch/ia64/xen/xen_pv_ops.c b/arch/ia64/xen/xen_pv_ops.c
index c35bb23..93a5c64 100644
--- a/arch/ia64/xen/xen_pv_ops.c
+++ b/arch/ia64/xen/xen_pv_ops.c
@@ -35,6 +35,8 @@
 #include <asm/xen/hypervisor.h>
 #include <asm/xen/xencomm.h>
 
+#include "irq_xen.h"
+
 /***************************************************************************
  * general info
  */
@@ -313,4 +315,5 @@ xen_setup_pv_ops(void)
 	pv_info = xen_info;
 	pv_init_ops = xen_init_ops;
 	pv_iosapic_ops = xen_iosapic_ops;
+	pv_irq_ops = xen_irq_ops;
 }
diff --git a/include/asm-ia64/hw_irq.h b/include/asm-ia64/hw_irq.h
index 678efec..80009cd 100644
--- a/include/asm-ia64/hw_irq.h
+++ b/include/asm-ia64/hw_irq.h
@@ -15,7 +15,11 @@
 #include <asm/ptrace.h>
 #include <asm/smp.h>
 
+#ifndef CONFIG_XEN
 typedef u8 ia64_vector;
+#else
+typedef u16 ia64_vector;
+#endif
 
 /*
  * 0 special
diff --git a/include/asm-ia64/irq.h b/include/asm-ia64/irq.h
index a66d268..aead249 100644
--- a/include/asm-ia64/irq.h
+++ b/include/asm-ia64/irq.h
@@ -14,6 +14,7 @@
 #include <linux/types.h>
 #include <linux/cpumask.h>
 
+#ifndef CONFIG_XEN
 #define NR_VECTORS	256
 
 #if (NR_VECTORS + 32 * NR_CPUS) < 1024
@@ -21,6 +22,38 @@
 #else
 #define NR_IRQS 1024
 #endif
+#else
+/*
+ * The flat IRQ space is divided into two regions:
+ *  1. A one-to-one mapping of real physical IRQs. This space is only used
+ *     if we have physical device-access privilege. This region is at the
+ *     start of the IRQ space so that existing device drivers do not need
+ *     to be modified to translate physical IRQ numbers into our IRQ space.
+ *  2. A dynamic mapping of inter-domain and Xen-sourced virtual IRQs. These
+ *     are bound using the provided bind/unbind functions.
+ */
+
+#define PIRQ_BASE		0
+#define NR_PIRQS		256
+
+#define DYNIRQ_BASE		(PIRQ_BASE + NR_PIRQS)
+#define NR_DYNIRQS		(CONFIG_NR_CPUS * 8)
+
+#define NR_IRQS			(NR_PIRQS + NR_DYNIRQS)
+#define NR_IRQ_VECTORS		NR_IRQS
+
+#define pirq_to_irq(_x)		((_x) + PIRQ_BASE)
+#define irq_to_pirq(_x)		((_x) - PIRQ_BASE)
+
+#define dynirq_to_irq(_x)	((_x) + DYNIRQ_BASE)
+#define irq_to_dynirq(_x)	((_x) - DYNIRQ_BASE)
+
+#define RESCHEDULE_VECTOR	0
+#define IPI_VECTOR		1
+#define CMCP_VECTOR		2
+#define CPEP_VECTOR		3
+#define NR_IPIS			4
+#endif /* CONFIG_XEN */
 
 static __inline__ int
 irq_canonicalize (int irq)
-- 
1.5.3

^ permalink raw reply related	[flat|nested] 7+ messages in thread
* Xen common code across architecture
  2008-03-05 18:19 ` [PATCH 50/50] ia64/pv_ops/xen: define xen pv_irq_ops Isaku Yamahata
@ 2008-03-20  9:13 ` Dong, Eddie
  2008-03-20 14:23   ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 7+ messages in thread
From: Dong, Eddie @ 2008-03-20 9:13 UTC (permalink / raw)
To: virtualization, linux-kernel
Cc: kvm-ia64-devel, xen-ia64-devel, linux-ia64, Andrew Morton

Jeremy & all:
	The current xen kernel code is in arch/x86/xen, but the xen dynamic
irqchip (events.c) is common to other architectures such as IA64. We
are in the process of enabling pv_ops for IA64 now and want to reuse the
same code. Do we need to move the code to some place common? Suggestions?

Thanks, eddie

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: Xen common code across architecture
  2008-03-20  9:13 ` Xen common code across architecture Dong, Eddie
@ 2008-03-20 14:23   ` Jeremy Fitzhardinge
  2008-03-25  6:13     ` Dong, Eddie
  0 siblings, 1 reply; 7+ messages in thread
From: Jeremy Fitzhardinge @ 2008-03-20 14:23 UTC (permalink / raw)
To: Dong, Eddie
Cc: virtualization, linux-kernel, Andrew Morton, linux-ia64,
	kvm-ia64-devel, xen-ia64-devel

Dong, Eddie wrote:
> Current xen kernel codes are in arch/x86/xen, but xen dynamic
> irqchip (events.c) are common for other architectures such as IA64. We
> are in progress with enabling pv_ops for IA64 now and want to reuse same
> code, do we need to move the code to some place common? suggestions?

I'm fine with moving common stuff like that to drivers/xen/.

    J

^ permalink raw reply	[flat|nested] 7+ messages in thread
* RE: Xen common code across architecture 2008-03-20 14:23 ` Jeremy Fitzhardinge @ 2008-03-25 6:13 ` Dong, Eddie 2008-03-25 6:35 ` Dong, Eddie 2008-03-25 15:08 ` Jeremy Fitzhardinge 0 siblings, 2 replies; 7+ messages in thread From: Dong, Eddie @ 2008-03-25 6:13 UTC (permalink / raw) To: Jeremy Fitzhardinge Cc: virtualization, linux-kernel, Andrew Morton, linux-ia64, kvm-ia64-devel, xen-ia64-devel [-- Attachment #1: Type: text/plain, Size: 32583 bytes --] Jeremy/Andrew: Isaku Yamahata, I and some other IA64/Xen community memebers are working together to enable pv_ops for IA64 Linux. This patch is a preparation to move common arch/x86/xen/events.c to drivers/xen (contents are identical) against mm tree, it is based on Yamahata's IA64/pv_ops patch serie. In case you want to have a brief view of whole pv_ops/IA64 patch serie, please refer to IA64 Linux mailinglist. Thanks, Eddie Move events.c to drivers/xen for IA64/Xen support. Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> diff -urN old/arch/x86/xen/events.c linux/arch/x86/xen/events.c --- old/arch/x86/xen/events.c 2008-03-10 13:22:27.000000000 +0800 +++ linux/arch/x86/xen/events.c 1970-01-01 08:00:00.000000000 +0800 @@ -1,591 +0,0 @@ -/* - * Xen event channels - * - * Xen models interrupts with abstract event channels. Because each - * domain gets 1024 event channels, but NR_IRQ is not that large, we - * must dynamically map irqs<->event channels. The event channels - * interface with the rest of the kernel by defining a xen interrupt - * chip. When an event is recieved, it is mapped to an irq and sent - * through the normal interrupt processing path. - * - * There are four kinds of events which can be mapped to an event - * channel: - * - * 1. Inter-domain notifications. This includes all the virtual - * device events, since they're driven by front-ends in another domain - * (typically dom0). - * 2. VIRQs, typically used for timers. These are per-cpu events. - * 3. IPIs. - * 4. Hardware interrupts. 
Not supported at present. - * - * Jeremy Fitzhardinge <jeremy@xensource.com>, XenSource Inc, 2007 - */ - -#include <linux/linkage.h> -#include <linux/interrupt.h> -#include <linux/irq.h> -#include <linux/module.h> -#include <linux/string.h> - -#include <asm/ptrace.h> -#include <asm/irq.h> -#include <asm/sync_bitops.h> -#include <asm/xen/hypercall.h> -#include <asm/xen/hypervisor.h> - -#include <xen/events.h> -#include <xen/interface/xen.h> -#include <xen/interface/event_channel.h> - -#include "xen-ops.h" - -/* - * This lock protects updates to the following mapping and reference-count - * arrays. The lock does not need to be acquired to read the mapping tables. - */ -static DEFINE_SPINLOCK(irq_mapping_update_lock); - -/* IRQ <-> VIRQ mapping. */ -static DEFINE_PER_CPU(int, virq_to_irq[NR_VIRQS]) = {[0 ... NR_VIRQS-1] = -1}; - -/* IRQ <-> IPI mapping */ -static DEFINE_PER_CPU(int, ipi_to_irq[XEN_NR_IPIS]) = {[0 ... XEN_NR_IPIS-1] = -1}; - -/* Packed IRQ information: binding type, sub-type index, and event channel. */ -struct packed_irq -{ - unsigned short evtchn; - unsigned char index; - unsigned char type; -}; - -static struct packed_irq irq_info[NR_IRQS]; - -/* Binding types. */ -enum { - IRQT_UNBOUND, - IRQT_PIRQ, - IRQT_VIRQ, - IRQT_IPI, - IRQT_EVTCHN -}; - -/* Convenient shorthand for packed representation of an unbound IRQ. */ -#define IRQ_UNBOUND mk_irq_info(IRQT_UNBOUND, 0, 0) - -static int evtchn_to_irq[NR_EVENT_CHANNELS] = { - [0 ... NR_EVENT_CHANNELS-1] = -1 -}; -static unsigned long cpu_evtchn_mask[NR_CPUS][NR_EVENT_CHANNELS/BITS_PER_LONG]; -static u8 cpu_evtchn[NR_EVENT_CHANNELS]; - -/* Reference counts for bindings to IRQs. */ -static int irq_bindcount[NR_IRQS]; - -/* Xen will never allocate port zero for any purpose. */ -#define VALID_EVTCHN(chn) ((chn) != 0) - -/* - * Force a proper event-channel callback from Xen after clearing the - * callback mask. We do this in a very simple manner, by making a call - * down into Xen. 
The pending flag will be checked by Xen on return. - */ -void force_evtchn_callback(void) -{ - (void)HYPERVISOR_xen_version(0, NULL); -} -EXPORT_SYMBOL_GPL(force_evtchn_callback); - -static struct irq_chip xen_dynamic_chip; - -/* Constructor for packed IRQ information. */ -static inline struct packed_irq mk_irq_info(u32 type, u32 index, u32 evtchn) -{ - return (struct packed_irq) { evtchn, index, type }; -} - -/* - * Accessors for packed IRQ information. - */ -static inline unsigned int evtchn_from_irq(int irq) -{ - return irq_info[irq].evtchn; -} - -static inline unsigned int index_from_irq(int irq) -{ - return irq_info[irq].index; -} - -static inline unsigned int type_from_irq(int irq) -{ - return irq_info[irq].type; -} - -static inline unsigned long active_evtchns(unsigned int cpu, - struct shared_info *sh, - unsigned int idx) -{ - return (sh->evtchn_pending[idx] & - cpu_evtchn_mask[cpu][idx] & - ~sh->evtchn_mask[idx]); -} - -static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu) -{ - int irq = evtchn_to_irq[chn]; - - BUG_ON(irq == -1); -#ifdef CONFIG_SMP - irq_desc[irq].affinity = cpumask_of_cpu(cpu); -#endif - - __clear_bit(chn, cpu_evtchn_mask[cpu_evtchn[chn]]); - __set_bit(chn, cpu_evtchn_mask[cpu]); - - cpu_evtchn[chn] = cpu; -} - -static void init_evtchn_cpu_bindings(void) -{ -#ifdef CONFIG_SMP - int i; - /* By default all event channels notify CPU#0. 
*/ - for (i = 0; i < NR_IRQS; i++) - irq_desc[i].affinity = cpumask_of_cpu(0); -#endif - - memset(cpu_evtchn, 0, sizeof(cpu_evtchn)); - memset(cpu_evtchn_mask[0], ~0, sizeof(cpu_evtchn_mask[0])); -} - -static inline unsigned int cpu_from_evtchn(unsigned int evtchn) -{ - return cpu_evtchn[evtchn]; -} - -static inline void clear_evtchn(int port) -{ - struct shared_info *s = HYPERVISOR_shared_info; - sync_clear_bit(port, &s->evtchn_pending[0]); -} - -static inline void set_evtchn(int port) -{ - struct shared_info *s = HYPERVISOR_shared_info; - sync_set_bit(port, &s->evtchn_pending[0]); -} - - -/** - * notify_remote_via_irq - send event to remote end of event channel via irq - * @irq: irq of event channel to send event to - * - * Unlike notify_remote_via_evtchn(), this is safe to use across - * save/restore. Notifications on a broken connection are silently - * dropped. - */ -void notify_remote_via_irq(int irq) -{ - int evtchn = evtchn_from_irq(irq); - - if (VALID_EVTCHN(evtchn)) - notify_remote_via_evtchn(evtchn); -} -EXPORT_SYMBOL_GPL(notify_remote_via_irq); - -static void mask_evtchn(int port) -{ - struct shared_info *s = HYPERVISOR_shared_info; - sync_set_bit(port, &s->evtchn_mask[0]); -} - -static void unmask_evtchn(int port) -{ - struct shared_info *s = HYPERVISOR_shared_info; - unsigned int cpu = get_cpu(); - - BUG_ON(!irqs_disabled()); - - /* Slow path (hypercall) if this is a non-local port. */ - if (unlikely(cpu != cpu_from_evtchn(port))) { - struct evtchn_unmask unmask = { .port = port }; - (void)HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask); - } else { - struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu); - - sync_clear_bit(port, &s->evtchn_mask[0]); - - /* - * The following is basically the equivalent of - * 'hw_resend_irq'. Just like a real IO-APIC we 'lose - * the interrupt edge' if the channel is masked. 
-                 */
-                if (sync_test_bit(port, &s->evtchn_pending[0]) &&
-                    !sync_test_and_set_bit(port / BITS_PER_LONG,
-                                           &vcpu_info->evtchn_pending_sel))
-                        vcpu_info->evtchn_upcall_pending = 1;
-        }
-
-        put_cpu();
-}
-
-static int find_unbound_irq(void)
-{
-        int irq;
-
-        /* Only allocate from dynirq range */
-        for (irq = 0; irq < NR_IRQS; irq++)
-                if (irq_bindcount[irq] == 0)
-                        break;
-
-        if (irq == NR_IRQS)
-                panic("No available IRQ to bind to: increase NR_IRQS!\n");
-
-        return irq;
-}
-
-int bind_evtchn_to_irq(unsigned int evtchn)
-{
-        int irq;
-
-        spin_lock(&irq_mapping_update_lock);
-
-        irq = evtchn_to_irq[evtchn];
-
-        if (irq == -1) {
-                irq = find_unbound_irq();
-
-                dynamic_irq_init(irq);
-                set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
-                                              handle_level_irq, "event");
-
-                evtchn_to_irq[evtchn] = irq;
-                irq_info[irq] = mk_irq_info(IRQT_EVTCHN, 0, evtchn);
-        }
-
-        irq_bindcount[irq]++;
-
-        spin_unlock(&irq_mapping_update_lock);
-
-        return irq;
-}
-EXPORT_SYMBOL_GPL(bind_evtchn_to_irq);
-
-static int bind_ipi_to_irq(unsigned int ipi, unsigned int cpu)
-{
-        struct evtchn_bind_ipi bind_ipi;
-        int evtchn, irq;
-
-        spin_lock(&irq_mapping_update_lock);
-
-        irq = per_cpu(ipi_to_irq, cpu)[ipi];
-        if (irq == -1) {
-                irq = find_unbound_irq();
-                if (irq < 0)
-                        goto out;
-
-                dynamic_irq_init(irq);
-                set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
-                                              handle_level_irq, "ipi");
-
-                bind_ipi.vcpu = cpu;
-                if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_ipi,
-                                                &bind_ipi) != 0)
-                        BUG();
-                evtchn = bind_ipi.port;
-
-                evtchn_to_irq[evtchn] = irq;
-                irq_info[irq] = mk_irq_info(IRQT_IPI, ipi, evtchn);
-
-                per_cpu(ipi_to_irq, cpu)[ipi] = irq;
-
-                bind_evtchn_to_cpu(evtchn, cpu);
-        }
-
-        irq_bindcount[irq]++;
-
- out:
-        spin_unlock(&irq_mapping_update_lock);
-        return irq;
-}
-
-
-static int bind_virq_to_irq(unsigned int virq, unsigned int cpu)
-{
-        struct evtchn_bind_virq bind_virq;
-        int evtchn, irq;
-
-        spin_lock(&irq_mapping_update_lock);
-
-        irq = per_cpu(virq_to_irq, cpu)[virq];
-
-        if (irq == -1) {
-                bind_virq.virq = virq;
-                bind_virq.vcpu = cpu;
-                if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq,
-                                                &bind_virq) != 0)
-                        BUG();
-                evtchn = bind_virq.port;
-
-                irq = find_unbound_irq();
-
-                dynamic_irq_init(irq);
-                set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
-                                              handle_level_irq, "virq");
-
-                evtchn_to_irq[evtchn] = irq;
-                irq_info[irq] = mk_irq_info(IRQT_VIRQ, virq, evtchn);
-
-                per_cpu(virq_to_irq, cpu)[virq] = irq;
-
-                bind_evtchn_to_cpu(evtchn, cpu);
-        }
-
-        irq_bindcount[irq]++;
-
-        spin_unlock(&irq_mapping_update_lock);
-
-        return irq;
-}
-
-static void unbind_from_irq(unsigned int irq)
-{
-        struct evtchn_close close;
-        int evtchn = evtchn_from_irq(irq);
-
-        spin_lock(&irq_mapping_update_lock);
-
-        if (VALID_EVTCHN(evtchn) && (--irq_bindcount[irq] == 0)) {
-                close.port = evtchn;
-                if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0)
-                        BUG();
-
-                switch (type_from_irq(irq)) {
-                case IRQT_VIRQ:
-                        per_cpu(virq_to_irq, cpu_from_evtchn(evtchn))
-                                [index_from_irq(irq)] = -1;
-                        break;
-                default:
-                        break;
-                }
-
-                /* Closed ports are implicitly re-bound to VCPU0. */
-                bind_evtchn_to_cpu(evtchn, 0);
-
-                evtchn_to_irq[evtchn] = -1;
-                irq_info[irq] = IRQ_UNBOUND;
-
-                dynamic_irq_init(irq);
-        }
-
-        spin_unlock(&irq_mapping_update_lock);
-}
-
-int bind_evtchn_to_irqhandler(unsigned int evtchn,
-                              irq_handler_t handler,
-                              unsigned long irqflags,
-                              const char *devname, void *dev_id)
-{
-        unsigned int irq;
-        int retval;
-
-        irq = bind_evtchn_to_irq(evtchn);
-        retval = request_irq(irq, handler, irqflags, devname, dev_id);
-        if (retval != 0) {
-                unbind_from_irq(irq);
-                return retval;
-        }
-
-        return irq;
-}
-EXPORT_SYMBOL_GPL(bind_evtchn_to_irqhandler);
-
-int bind_virq_to_irqhandler(unsigned int virq, unsigned int cpu,
-                            irq_handler_t handler,
-                            unsigned long irqflags, const char *devname, void *dev_id)
-{
-        unsigned int irq;
-        int retval;
-
-        irq = bind_virq_to_irq(virq, cpu);
-        retval = request_irq(irq, handler, irqflags, devname, dev_id);
-        if (retval != 0) {
-                unbind_from_irq(irq);
-                return retval;
-        }
-
-        return irq;
-}
-EXPORT_SYMBOL_GPL(bind_virq_to_irqhandler);
-
-int bind_ipi_to_irqhandler(enum ipi_vector ipi,
-                           unsigned int cpu,
-                           irq_handler_t handler,
-                           unsigned long irqflags,
-                           const char *devname,
-                           void *dev_id)
-{
-        int irq, retval;
-
-        irq = bind_ipi_to_irq(ipi, cpu);
-        if (irq < 0)
-                return irq;
-
-        retval = request_irq(irq, handler, irqflags, devname, dev_id);
-        if (retval != 0) {
-                unbind_from_irq(irq);
-                return retval;
-        }
-
-        return irq;
-}
-
-void unbind_from_irqhandler(unsigned int irq, void *dev_id)
-{
-        free_irq(irq, dev_id);
-        unbind_from_irq(irq);
-}
-EXPORT_SYMBOL_GPL(unbind_from_irqhandler);
-
-void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector)
-{
-        int irq = per_cpu(ipi_to_irq, cpu)[vector];
-        BUG_ON(irq < 0);
-        notify_remote_via_irq(irq);
-}
-
-
-/*
- * Search the CPUs pending events bitmasks. For each one found, map
- * the event number to an irq, and feed it into do_IRQ() for
- * handling.
- *
- * Xen uses a two-level bitmap to speed searching. The first level is
- * a bitset of words which contain pending event bits. The second
- * level is a bitset of pending events themselves.
- */
-void xen_evtchn_do_upcall(struct pt_regs *regs)
-{
-        int cpu = get_cpu();
-        struct shared_info *s = HYPERVISOR_shared_info;
-        struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu);
-        unsigned long pending_words;
-
-        vcpu_info->evtchn_upcall_pending = 0;
-
-        /* NB. No need for a barrier here -- XCHG is a barrier on x86. */
-        pending_words = xchg(&vcpu_info->evtchn_pending_sel, 0);
-        while (pending_words != 0) {
-                unsigned long pending_bits;
-                int word_idx = __ffs(pending_words);
-                pending_words &= ~(1UL << word_idx);
-
-                while ((pending_bits = active_evtchns(cpu, s, word_idx)) != 0) {
-                        int bit_idx = __ffs(pending_bits);
-                        int port = (word_idx * BITS_PER_LONG) + bit_idx;
-                        int irq = evtchn_to_irq[port];
-
-                        if (irq != -1) {
-                                regs->orig_ax = ~irq;
-                                do_IRQ(regs);
-                        }
-                }
-        }
-
-        put_cpu();
-}
-
-/* Rebind an evtchn so that it gets delivered to a specific cpu */
-static void rebind_irq_to_cpu(unsigned irq, unsigned tcpu)
-{
-        struct evtchn_bind_vcpu bind_vcpu;
-        int evtchn = evtchn_from_irq(irq);
-
-        if (!VALID_EVTCHN(evtchn))
-                return;
-
-        /* Send future instances of this interrupt to other vcpu. */
-        bind_vcpu.port = evtchn;
-        bind_vcpu.vcpu = tcpu;
-
-        /*
-         * If this fails, it usually just indicates that we're dealing with a
-         * virq or IPI channel, which don't actually need to be rebound. Ignore
-         * it, but don't do the xenlinux-level rebind in that case.
-         */
-        if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) >= 0)
-                bind_evtchn_to_cpu(evtchn, tcpu);
-}
-
-
-static void set_affinity_irq(unsigned irq, cpumask_t dest)
-{
-        unsigned tcpu = first_cpu(dest);
-        rebind_irq_to_cpu(irq, tcpu);
-}
-
-static void enable_dynirq(unsigned int irq)
-{
-        int evtchn = evtchn_from_irq(irq);
-
-        if (VALID_EVTCHN(evtchn))
-                unmask_evtchn(evtchn);
-}
-
-static void disable_dynirq(unsigned int irq)
-{
-        int evtchn = evtchn_from_irq(irq);
-
-        if (VALID_EVTCHN(evtchn))
-                mask_evtchn(evtchn);
-}
-
-static void ack_dynirq(unsigned int irq)
-{
-        int evtchn = evtchn_from_irq(irq);
-
-        move_native_irq(irq);
-
-        if (VALID_EVTCHN(evtchn))
-                clear_evtchn(evtchn);
-}
-
-static int retrigger_dynirq(unsigned int irq)
-{
-        int evtchn = evtchn_from_irq(irq);
-        int ret = 0;
-
-        if (VALID_EVTCHN(evtchn)) {
-                set_evtchn(evtchn);
-                ret = 1;
-        }
-
-        return ret;
-}
-
-static struct irq_chip xen_dynamic_chip __read_mostly = {
-        .name           = "xen-dyn",
-        .mask           = disable_dynirq,
-        .unmask         = enable_dynirq,
-        .ack            = ack_dynirq,
-        .set_affinity   = set_affinity_irq,
-        .retrigger      = retrigger_dynirq,
-};
-
-void __init xen_init_IRQ(void)
-{
-        int i;
-
-        init_evtchn_cpu_bindings();
-
-        /* No event channels are 'live' right now. */
-        for (i = 0; i < NR_EVENT_CHANNELS; i++)
-                mask_evtchn(i);
-
-        /* Dynamic IRQ space is currently unbound. Zero the refcnts.
- */
-        for (i = 0; i < NR_IRQS; i++)
-                irq_bindcount[i] = 0;
-
-        irq_ctx_init(smp_processor_id());
-}
diff -urN old/arch/x86/xen/Makefile linux/arch/x86/xen/Makefile
--- old/arch/x86/xen/Makefile	2008-03-10 13:22:27.000000000 +0800
+++ linux/arch/x86/xen/Makefile	2008-03-25 13:56:41.367764448 +0800
@@ -1,4 +1,4 @@
 obj-y		:= enlighten.o setup.o features.o multicalls.o mmu.o \
-			events.o time.o manage.o xen-asm.o
+			time.o manage.o xen-asm.o

 obj-$(CONFIG_SMP)	+= smp.o
diff -urN old/arch/x86/xen/xen-ops.h linux/arch/x86/xen/xen-ops.h
--- old/arch/x86/xen/xen-ops.h	2008-03-25 13:21:09.996527604 +0800
+++ linux/arch/x86/xen/xen-ops.h	2008-03-25 13:59:16.349809137 +0800
@@ -2,6 +2,7 @@
 #define XEN_OPS_H

 #include <linux/init.h>
+#include <xen/xen-ops.h>

 /* These are code, but not functions. Defined in entry.S */
 extern const char xen_hypervisor_callback[];
@@ -9,7 +10,6 @@

 void xen_copy_trap_info(struct trap_info *traps);

-DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu);
 DECLARE_PER_CPU(unsigned long, xen_cr3);
 DECLARE_PER_CPU(unsigned long, xen_current_cr3);
diff -urN old/drivers/xen/events.c linux/drivers/xen/events.c
--- old/drivers/xen/events.c	1970-01-01 08:00:00.000000000 +0800
+++ linux/drivers/xen/events.c	2008-03-25 13:56:41.368764287 +0800
@@ -0,0 +1,591 @@
+/*
+ * Xen event channels
+ *
+ * Xen models interrupts with abstract event channels. Because each
+ * domain gets 1024 event channels, but NR_IRQ is not that large, we
+ * must dynamically map irqs<->event channels. The event channels
+ * interface with the rest of the kernel by defining a xen interrupt
+ * chip. When an event is received, it is mapped to an irq and sent
+ * through the normal interrupt processing path.
+ *
+ * There are four kinds of events which can be mapped to an event
+ * channel:
+ *
+ * 1. Inter-domain notifications. This includes all the virtual
+ *    device events, since they're driven by front-ends in another domain
+ *    (typically dom0).
+ * 2. VIRQs, typically used for timers. These are per-cpu events.
+ * 3. IPIs.
+ * 4. Hardware interrupts. Not supported at present.
+ *
+ * Jeremy Fitzhardinge <jeremy@xensource.com>, XenSource Inc, 2007
+ */
+
+#include <linux/linkage.h>
+#include <linux/interrupt.h>
+#include <linux/irq.h>
+#include <linux/module.h>
+#include <linux/string.h>
+
+#include <asm/ptrace.h>
+#include <asm/irq.h>
+#include <asm/sync_bitops.h>
+#include <asm/xen/hypercall.h>
+#include <asm/xen/hypervisor.h>
+
+#include <xen/events.h>
+#include <xen/interface/xen.h>
+#include <xen/interface/event_channel.h>
+
+#include "xen-ops.h"
+
+/*
+ * This lock protects updates to the following mapping and reference-count
+ * arrays. The lock does not need to be acquired to read the mapping tables.
+ */
+static DEFINE_SPINLOCK(irq_mapping_update_lock);
+
+/* IRQ <-> VIRQ mapping. */
+static DEFINE_PER_CPU(int, virq_to_irq[NR_VIRQS]) = {[0 ... NR_VIRQS-1] = -1};
+
+/* IRQ <-> IPI mapping */
+static DEFINE_PER_CPU(int, ipi_to_irq[XEN_NR_IPIS]) = {[0 ... XEN_NR_IPIS-1] = -1};
+
+/* Packed IRQ information: binding type, sub-type index, and event channel. */
+struct packed_irq
+{
+        unsigned short evtchn;
+        unsigned char index;
+        unsigned char type;
+};
+
+static struct packed_irq irq_info[NR_IRQS];
+
+/* Binding types. */
+enum {
+        IRQT_UNBOUND,
+        IRQT_PIRQ,
+        IRQT_VIRQ,
+        IRQT_IPI,
+        IRQT_EVTCHN
+};
+
+/* Convenient shorthand for packed representation of an unbound IRQ. */
+#define IRQ_UNBOUND	mk_irq_info(IRQT_UNBOUND, 0, 0)
+
+static int evtchn_to_irq[NR_EVENT_CHANNELS] = {
+        [0 ... NR_EVENT_CHANNELS-1] = -1
+};
+static unsigned long cpu_evtchn_mask[NR_CPUS][NR_EVENT_CHANNELS/BITS_PER_LONG];
+static u8 cpu_evtchn[NR_EVENT_CHANNELS];
+
+/* Reference counts for bindings to IRQs. */
+static int irq_bindcount[NR_IRQS];
+
+/* Xen will never allocate port zero for any purpose. */
+#define VALID_EVTCHN(chn)	((chn) != 0)
+
+/*
+ * Force a proper event-channel callback from Xen after clearing the
+ * callback mask. We do this in a very simple manner, by making a call
+ * down into Xen. The pending flag will be checked by Xen on return.
+ */
+void force_evtchn_callback(void)
+{
+        (void)HYPERVISOR_xen_version(0, NULL);
+}
+EXPORT_SYMBOL_GPL(force_evtchn_callback);
+
+static struct irq_chip xen_dynamic_chip;
+
+/* Constructor for packed IRQ information. */
+static inline struct packed_irq mk_irq_info(u32 type, u32 index, u32 evtchn)
+{
+        return (struct packed_irq) { evtchn, index, type };
+}
+
+/*
+ * Accessors for packed IRQ information.
+ */
+static inline unsigned int evtchn_from_irq(int irq)
+{
+        return irq_info[irq].evtchn;
+}
+
+static inline unsigned int index_from_irq(int irq)
+{
+        return irq_info[irq].index;
+}
+
+static inline unsigned int type_from_irq(int irq)
+{
+        return irq_info[irq].type;
+}
+
+static inline unsigned long active_evtchns(unsigned int cpu,
+                                           struct shared_info *sh,
+                                           unsigned int idx)
+{
+        return (sh->evtchn_pending[idx] &
+                cpu_evtchn_mask[cpu][idx] &
+                ~sh->evtchn_mask[idx]);
+}
+
+static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu)
+{
+        int irq = evtchn_to_irq[chn];
+
+        BUG_ON(irq == -1);
+#ifdef CONFIG_SMP
+        irq_desc[irq].affinity = cpumask_of_cpu(cpu);
+#endif
+
+        __clear_bit(chn, cpu_evtchn_mask[cpu_evtchn[chn]]);
+        __set_bit(chn, cpu_evtchn_mask[cpu]);
+
+        cpu_evtchn[chn] = cpu;
+}
+
+static void init_evtchn_cpu_bindings(void)
+{
+#ifdef CONFIG_SMP
+        int i;
+        /* By default all event channels notify CPU#0. */
+        for (i = 0; i < NR_IRQS; i++)
+                irq_desc[i].affinity = cpumask_of_cpu(0);
+#endif
+
+        memset(cpu_evtchn, 0, sizeof(cpu_evtchn));
+        memset(cpu_evtchn_mask[0], ~0, sizeof(cpu_evtchn_mask[0]));
+}
+
+static inline unsigned int cpu_from_evtchn(unsigned int evtchn)
+{
+        return cpu_evtchn[evtchn];
+}
+
+static inline void clear_evtchn(int port)
+{
+        struct shared_info *s = HYPERVISOR_shared_info;
+        sync_clear_bit(port, &s->evtchn_pending[0]);
+}
+
+static inline void set_evtchn(int port)
+{
+        struct shared_info *s = HYPERVISOR_shared_info;
+        sync_set_bit(port, &s->evtchn_pending[0]);
+}
+
+
+/**
+ * notify_remote_via_irq - send event to remote end of event channel via irq
+ * @irq: irq of event channel to send event to
+ *
+ * Unlike notify_remote_via_evtchn(), this is safe to use across
+ * save/restore. Notifications on a broken connection are silently
+ * dropped.
+ */
+void notify_remote_via_irq(int irq)
+{
+        int evtchn = evtchn_from_irq(irq);
+
+        if (VALID_EVTCHN(evtchn))
+                notify_remote_via_evtchn(evtchn);
+}
+EXPORT_SYMBOL_GPL(notify_remote_via_irq);
+
+static void mask_evtchn(int port)
+{
+        struct shared_info *s = HYPERVISOR_shared_info;
+        sync_set_bit(port, &s->evtchn_mask[0]);
+}
+
+static void unmask_evtchn(int port)
+{
+        struct shared_info *s = HYPERVISOR_shared_info;
+        unsigned int cpu = get_cpu();
+
+        BUG_ON(!irqs_disabled());
+
+        /* Slow path (hypercall) if this is a non-local port. */
+        if (unlikely(cpu != cpu_from_evtchn(port))) {
+                struct evtchn_unmask unmask = { .port = port };
+                (void)HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask);
+        } else {
+                struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu);
+
+                sync_clear_bit(port, &s->evtchn_mask[0]);
+
+                /*
+                 * The following is basically the equivalent of
+                 * 'hw_resend_irq'. Just like a real IO-APIC we 'lose
+                 * the interrupt edge' if the channel is masked.
+                 */
+                if (sync_test_bit(port, &s->evtchn_pending[0]) &&
+                    !sync_test_and_set_bit(port / BITS_PER_LONG,
+                                           &vcpu_info->evtchn_pending_sel))
+                        vcpu_info->evtchn_upcall_pending = 1;
+        }
+
+        put_cpu();
+}
+
+static int find_unbound_irq(void)
+{
+        int irq;
+
+        /* Only allocate from dynirq range */
+        for (irq = 0; irq < NR_IRQS; irq++)
+                if (irq_bindcount[irq] == 0)
+                        break;
+
+        if (irq == NR_IRQS)
+                panic("No available IRQ to bind to: increase NR_IRQS!\n");
+
+        return irq;
+}
+
+int bind_evtchn_to_irq(unsigned int evtchn)
+{
+        int irq;
+
+        spin_lock(&irq_mapping_update_lock);
+
+        irq = evtchn_to_irq[evtchn];
+
+        if (irq == -1) {
+                irq = find_unbound_irq();
+
+                dynamic_irq_init(irq);
+                set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
+                                              handle_level_irq, "event");
+
+                evtchn_to_irq[evtchn] = irq;
+                irq_info[irq] = mk_irq_info(IRQT_EVTCHN, 0, evtchn);
+        }
+
+        irq_bindcount[irq]++;
+
+        spin_unlock(&irq_mapping_update_lock);
+
+        return irq;
+}
+EXPORT_SYMBOL_GPL(bind_evtchn_to_irq);
+
+static int bind_ipi_to_irq(unsigned int ipi, unsigned int cpu)
+{
+        struct evtchn_bind_ipi bind_ipi;
+        int evtchn, irq;
+
+        spin_lock(&irq_mapping_update_lock);
+
+        irq = per_cpu(ipi_to_irq, cpu)[ipi];
+        if (irq == -1) {
+                irq = find_unbound_irq();
+                if (irq < 0)
+                        goto out;
+
+                dynamic_irq_init(irq);
+                set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
+                                              handle_level_irq, "ipi");
+
+                bind_ipi.vcpu = cpu;
+                if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_ipi,
+                                                &bind_ipi) != 0)
+                        BUG();
+                evtchn = bind_ipi.port;
+
+                evtchn_to_irq[evtchn] = irq;
+                irq_info[irq] = mk_irq_info(IRQT_IPI, ipi, evtchn);
+
+                per_cpu(ipi_to_irq, cpu)[ipi] = irq;
+
+                bind_evtchn_to_cpu(evtchn, cpu);
+        }
+
+        irq_bindcount[irq]++;
+
+ out:
+        spin_unlock(&irq_mapping_update_lock);
+        return irq;
+}
+
+
+static int bind_virq_to_irq(unsigned int virq, unsigned int cpu)
+{
+        struct evtchn_bind_virq bind_virq;
+        int evtchn, irq;
+
+        spin_lock(&irq_mapping_update_lock);
+
+        irq = per_cpu(virq_to_irq, cpu)[virq];
+
+        if (irq == -1) {
+                bind_virq.virq = virq;
+                bind_virq.vcpu = cpu;
+                if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq,
+                                                &bind_virq) != 0)
+                        BUG();
+                evtchn = bind_virq.port;
+
+                irq = find_unbound_irq();
+
+                dynamic_irq_init(irq);
+                set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
+                                              handle_level_irq, "virq");
+
+                evtchn_to_irq[evtchn] = irq;
+                irq_info[irq] = mk_irq_info(IRQT_VIRQ, virq, evtchn);
+
+                per_cpu(virq_to_irq, cpu)[virq] = irq;
+
+                bind_evtchn_to_cpu(evtchn, cpu);
+        }
+
+        irq_bindcount[irq]++;
+
+        spin_unlock(&irq_mapping_update_lock);
+
+        return irq;
+}
+
+static void unbind_from_irq(unsigned int irq)
+{
+        struct evtchn_close close;
+        int evtchn = evtchn_from_irq(irq);
+
+        spin_lock(&irq_mapping_update_lock);
+
+        if (VALID_EVTCHN(evtchn) && (--irq_bindcount[irq] == 0)) {
+                close.port = evtchn;
+                if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0)
+                        BUG();
+
+                switch (type_from_irq(irq)) {
+                case IRQT_VIRQ:
+                        per_cpu(virq_to_irq, cpu_from_evtchn(evtchn))
+                                [index_from_irq(irq)] = -1;
+                        break;
+                default:
+                        break;
+                }
+
+                /* Closed ports are implicitly re-bound to VCPU0. */
+                bind_evtchn_to_cpu(evtchn, 0);
+
+                evtchn_to_irq[evtchn] = -1;
+                irq_info[irq] = IRQ_UNBOUND;
+
+                dynamic_irq_init(irq);
+        }
+
+        spin_unlock(&irq_mapping_update_lock);
+}
+
+int bind_evtchn_to_irqhandler(unsigned int evtchn,
+                              irq_handler_t handler,
+                              unsigned long irqflags,
+                              const char *devname, void *dev_id)
+{
+        unsigned int irq;
+        int retval;
+
+        irq = bind_evtchn_to_irq(evtchn);
+        retval = request_irq(irq, handler, irqflags, devname, dev_id);
+        if (retval != 0) {
+                unbind_from_irq(irq);
+                return retval;
+        }
+
+        return irq;
+}
+EXPORT_SYMBOL_GPL(bind_evtchn_to_irqhandler);
+
+int bind_virq_to_irqhandler(unsigned int virq, unsigned int cpu,
+                            irq_handler_t handler,
+                            unsigned long irqflags, const char *devname, void *dev_id)
+{
+        unsigned int irq;
+        int retval;
+
+        irq = bind_virq_to_irq(virq, cpu);
+        retval = request_irq(irq, handler, irqflags, devname, dev_id);
+        if (retval != 0) {
+                unbind_from_irq(irq);
+                return retval;
+        }
+
+        return irq;
+}
+EXPORT_SYMBOL_GPL(bind_virq_to_irqhandler);
+
+int bind_ipi_to_irqhandler(enum ipi_vector ipi,
+                           unsigned int cpu,
+                           irq_handler_t handler,
+                           unsigned long irqflags,
+                           const char *devname,
+                           void *dev_id)
+{
+        int irq, retval;
+
+        irq = bind_ipi_to_irq(ipi, cpu);
+        if (irq < 0)
+                return irq;
+
+        retval = request_irq(irq, handler, irqflags, devname, dev_id);
+        if (retval != 0) {
+                unbind_from_irq(irq);
+                return retval;
+        }
+
+        return irq;
+}
+
+void unbind_from_irqhandler(unsigned int irq, void *dev_id)
+{
+        free_irq(irq, dev_id);
+        unbind_from_irq(irq);
+}
+EXPORT_SYMBOL_GPL(unbind_from_irqhandler);
+
+void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector)
+{
+        int irq = per_cpu(ipi_to_irq, cpu)[vector];
+        BUG_ON(irq < 0);
+        notify_remote_via_irq(irq);
+}
+
+
+/*
+ * Search the CPU's pending events bitmasks. For each one found, map
+ * the event number to an irq, and feed it into do_IRQ() for
+ * handling.
+ *
+ * Xen uses a two-level bitmap to speed searching. The first level is
+ * a bitset of words which contain pending event bits. The second
+ * level is a bitset of pending events themselves.
+ */
+void xen_evtchn_do_upcall(struct pt_regs *regs)
+{
+        int cpu = get_cpu();
+        struct shared_info *s = HYPERVISOR_shared_info;
+        struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu);
+        unsigned long pending_words;
+
+        vcpu_info->evtchn_upcall_pending = 0;
+
+        /* NB. No need for a barrier here -- XCHG is a barrier on x86. */
+        pending_words = xchg(&vcpu_info->evtchn_pending_sel, 0);
+        while (pending_words != 0) {
+                unsigned long pending_bits;
+                int word_idx = __ffs(pending_words);
+                pending_words &= ~(1UL << word_idx);
+
+                while ((pending_bits = active_evtchns(cpu, s, word_idx)) != 0) {
+                        int bit_idx = __ffs(pending_bits);
+                        int port = (word_idx * BITS_PER_LONG) + bit_idx;
+                        int irq = evtchn_to_irq[port];
+
+                        if (irq != -1) {
+                                regs->orig_ax = ~irq;
+                                do_IRQ(regs);
+                        }
+                }
+        }
+
+        put_cpu();
+}
+
+/* Rebind an evtchn so that it gets delivered to a specific cpu */
+static void rebind_irq_to_cpu(unsigned irq, unsigned tcpu)
+{
+        struct evtchn_bind_vcpu bind_vcpu;
+        int evtchn = evtchn_from_irq(irq);
+
+        if (!VALID_EVTCHN(evtchn))
+                return;
+
+        /* Send future instances of this interrupt to other vcpu. */
+        bind_vcpu.port = evtchn;
+        bind_vcpu.vcpu = tcpu;
+
+        /*
+         * If this fails, it usually just indicates that we're dealing with a
+         * virq or IPI channel, which don't actually need to be rebound. Ignore
+         * it, but don't do the xenlinux-level rebind in that case.
+         */
+        if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) >= 0)
+                bind_evtchn_to_cpu(evtchn, tcpu);
+}
+
+
+static void set_affinity_irq(unsigned irq, cpumask_t dest)
+{
+        unsigned tcpu = first_cpu(dest);
+        rebind_irq_to_cpu(irq, tcpu);
+}
+
+static void enable_dynirq(unsigned int irq)
+{
+        int evtchn = evtchn_from_irq(irq);
+
+        if (VALID_EVTCHN(evtchn))
+                unmask_evtchn(evtchn);
+}
+
+static void disable_dynirq(unsigned int irq)
+{
+        int evtchn = evtchn_from_irq(irq);
+
+        if (VALID_EVTCHN(evtchn))
+                mask_evtchn(evtchn);
+}
+
+static void ack_dynirq(unsigned int irq)
+{
+        int evtchn = evtchn_from_irq(irq);
+
+        move_native_irq(irq);
+
+        if (VALID_EVTCHN(evtchn))
+                clear_evtchn(evtchn);
+}
+
+static int retrigger_dynirq(unsigned int irq)
+{
+        int evtchn = evtchn_from_irq(irq);
+        int ret = 0;
+
+        if (VALID_EVTCHN(evtchn)) {
+                set_evtchn(evtchn);
+                ret = 1;
+        }
+
+        return ret;
+}
+
+static struct irq_chip xen_dynamic_chip __read_mostly = {
+        .name           = "xen-dyn",
+        .mask           = disable_dynirq,
+        .unmask         = enable_dynirq,
+        .ack            = ack_dynirq,
+        .set_affinity   = set_affinity_irq,
+        .retrigger      = retrigger_dynirq,
+};
+
+void __init xen_init_IRQ(void)
+{
+        int i;
+
+        init_evtchn_cpu_bindings();
+
+        /* No event channels are 'live' right now. */
+        for (i = 0; i < NR_EVENT_CHANNELS; i++)
+                mask_evtchn(i);
+
+        /* Dynamic IRQ space is currently unbound. Zero the refcnts.
+ */
+        for (i = 0; i < NR_IRQS; i++)
+                irq_bindcount[i] = 0;
+
+        irq_ctx_init(smp_processor_id());
+}
diff -urN old/drivers/xen/Makefile linux/drivers/xen/Makefile
--- old/drivers/xen/Makefile	2008-03-10 13:22:27.000000000 +0800
+++ linux/drivers/xen/Makefile	2008-03-25 13:56:41.368764287 +0800
@@ -1,2 +1,2 @@
-obj-y	+= grant-table.o
+obj-y	+= grant-table.o events.o
 obj-y	+= xenbus/
diff -urN old/include/xen/xen-ops.h linux/include/xen/xen-ops.h
--- old/include/xen/xen-ops.h	1970-01-01 08:00:00.000000000 +0800
+++ linux/include/xen/xen-ops.h	2008-03-25 14:00:09.041321546 +0800
@@ -0,0 +1,6 @@
+#ifndef INCLUDE_XEN_OPS_H
+#define INCLUDE_XEN_OPS_H
+
+DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu);
+
+#endif /* INCLUDE_XEN_OPS_H */

[-- Attachment #2: move_xenirq3.patch --]
[-- Type: application/octet-stream, Size: 30860 bytes --]

Move events.c to drivers/xen for IA64/Xen support.

Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
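The `xen_evtchn_do_upcall()` routine in the patch above walks a two-level bitmap: a selector word flags which words of the pending array hold set bits, then each flagged word is scanned bit by bit. The same search strategy can be sketched in self-contained C. Everything here is illustrative, not the kernel's API: the names, the fixed two-word bitmap, and consuming events by clearing them inline are all assumptions of this sketch, and `__builtin_ctzl` stands in for the kernel's `__ffs()`:

```c
#include <assert.h>

#define BITS_PER_LONG ((int)(8 * sizeof(unsigned long)))

/*
 * Two-level scan: each set bit in 'sel' names a word of 'pending' that
 * holds at least one pending event.  Every event found is cleared and
 * its port number recorded in out[].  Returns the number of ports found.
 */
int scan_two_level(unsigned long sel, unsigned long *pending, int *out)
{
        int count = 0;

        while (sel != 0) {
                int word_idx = __builtin_ctzl(sel);   /* lowest set bit, like __ffs() */
                sel &= ~(1UL << word_idx);

                while (pending[word_idx] != 0) {
                        int bit_idx = __builtin_ctzl(pending[word_idx]);

                        pending[word_idx] &= ~(1UL << bit_idx);   /* consume the event */
                        out[count++] = word_idx * BITS_PER_LONG + bit_idx;
                }
        }
        return count;
}
```

In the real driver the inner loop additionally filters ports through `active_evtchns()` (pending, bound to this CPU, and not globally masked) and feeds each port's irq into `do_IRQ()` instead of collecting it into an array.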
*/ - for (i = 0; i < NR_IRQS; i++) - irq_bindcount[i] = 0; - - irq_ctx_init(smp_processor_id()); -} diff -urN old/arch/x86/xen/Makefile linux/arch/x86/xen/Makefile --- old/arch/x86/xen/Makefile 2008-03-10 13:22:27.000000000 +0800 +++ linux/arch/x86/xen/Makefile 2008-03-25 13:56:41.367764448 +0800 @@ -1,4 +1,4 @@ obj-y := enlighten.o setup.o features.o multicalls.o mmu.o \ - events.o time.o manage.o xen-asm.o + time.o manage.o xen-asm.o obj-$(CONFIG_SMP) += smp.o diff -urN old/arch/x86/xen/xen-ops.h linux/arch/x86/xen/xen-ops.h --- old/arch/x86/xen/xen-ops.h 2008-03-25 13:21:09.996527604 +0800 +++ linux/arch/x86/xen/xen-ops.h 2008-03-25 13:59:16.349809137 +0800 @@ -2,6 +2,7 @@ #define XEN_OPS_H #include <linux/init.h> +#include <xen/xen-ops.h> /* These are code, but not functions. Defined in entry.S */ extern const char xen_hypervisor_callback[]; @@ -9,7 +10,6 @@ void xen_copy_trap_info(struct trap_info *traps); -DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu); DECLARE_PER_CPU(unsigned long, xen_cr3); DECLARE_PER_CPU(unsigned long, xen_current_cr3); diff -urN old/drivers/xen/events.c linux/drivers/xen/events.c --- old/drivers/xen/events.c 1970-01-01 08:00:00.000000000 +0800 +++ linux/drivers/xen/events.c 2008-03-25 13:56:41.368764287 +0800 @@ -0,0 +1,591 @@ +/* + * Xen event channels + * + * Xen models interrupts with abstract event channels. Because each + * domain gets 1024 event channels, but NR_IRQ is not that large, we + * must dynamically map irqs<->event channels. The event channels + * interface with the rest of the kernel by defining a xen interrupt + * chip. When an event is recieved, it is mapped to an irq and sent + * through the normal interrupt processing path. + * + * There are four kinds of events which can be mapped to an event + * channel: + * + * 1. Inter-domain notifications. This includes all the virtual + * device events, since they're driven by front-ends in another domain + * (typically dom0). + * 2. VIRQs, typically used for timers. 
These are per-cpu events. + * 3. IPIs. + * 4. Hardware interrupts. Not supported at present. + * + * Jeremy Fitzhardinge <jeremy@xensource.com>, XenSource Inc, 2007 + */ + +#include <linux/linkage.h> +#include <linux/interrupt.h> +#include <linux/irq.h> +#include <linux/module.h> +#include <linux/string.h> + +#include <asm/ptrace.h> +#include <asm/irq.h> +#include <asm/sync_bitops.h> +#include <asm/xen/hypercall.h> +#include <asm/xen/hypervisor.h> + +#include <xen/events.h> +#include <xen/interface/xen.h> +#include <xen/interface/event_channel.h> + +#include "xen-ops.h" + +/* + * This lock protects updates to the following mapping and reference-count + * arrays. The lock does not need to be acquired to read the mapping tables. + */ +static DEFINE_SPINLOCK(irq_mapping_update_lock); + +/* IRQ <-> VIRQ mapping. */ +static DEFINE_PER_CPU(int, virq_to_irq[NR_VIRQS]) = {[0 ... NR_VIRQS-1] = -1}; + +/* IRQ <-> IPI mapping */ +static DEFINE_PER_CPU(int, ipi_to_irq[XEN_NR_IPIS]) = {[0 ... XEN_NR_IPIS-1] = -1}; + +/* Packed IRQ information: binding type, sub-type index, and event channel. */ +struct packed_irq +{ + unsigned short evtchn; + unsigned char index; + unsigned char type; +}; + +static struct packed_irq irq_info[NR_IRQS]; + +/* Binding types. */ +enum { + IRQT_UNBOUND, + IRQT_PIRQ, + IRQT_VIRQ, + IRQT_IPI, + IRQT_EVTCHN +}; + +/* Convenient shorthand for packed representation of an unbound IRQ. */ +#define IRQ_UNBOUND mk_irq_info(IRQT_UNBOUND, 0, 0) + +static int evtchn_to_irq[NR_EVENT_CHANNELS] = { + [0 ... NR_EVENT_CHANNELS-1] = -1 +}; +static unsigned long cpu_evtchn_mask[NR_CPUS][NR_EVENT_CHANNELS/BITS_PER_LONG]; +static u8 cpu_evtchn[NR_EVENT_CHANNELS]; + +/* Reference counts for bindings to IRQs. */ +static int irq_bindcount[NR_IRQS]; + +/* Xen will never allocate port zero for any purpose. */ +#define VALID_EVTCHN(chn) ((chn) != 0) + +/* + * Force a proper event-channel callback from Xen after clearing the + * callback mask. 
We do this in a very simple manner, by making a call + * down into Xen. The pending flag will be checked by Xen on return. + */ +void force_evtchn_callback(void) +{ + (void)HYPERVISOR_xen_version(0, NULL); +} +EXPORT_SYMBOL_GPL(force_evtchn_callback); + +static struct irq_chip xen_dynamic_chip; + +/* Constructor for packed IRQ information. */ +static inline struct packed_irq mk_irq_info(u32 type, u32 index, u32 evtchn) +{ + return (struct packed_irq) { evtchn, index, type }; +} + +/* + * Accessors for packed IRQ information. + */ +static inline unsigned int evtchn_from_irq(int irq) +{ + return irq_info[irq].evtchn; +} + +static inline unsigned int index_from_irq(int irq) +{ + return irq_info[irq].index; +} + +static inline unsigned int type_from_irq(int irq) +{ + return irq_info[irq].type; +} + +static inline unsigned long active_evtchns(unsigned int cpu, + struct shared_info *sh, + unsigned int idx) +{ + return (sh->evtchn_pending[idx] & + cpu_evtchn_mask[cpu][idx] & + ~sh->evtchn_mask[idx]); +} + +static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu) +{ + int irq = evtchn_to_irq[chn]; + + BUG_ON(irq == -1); +#ifdef CONFIG_SMP + irq_desc[irq].affinity = cpumask_of_cpu(cpu); +#endif + + __clear_bit(chn, cpu_evtchn_mask[cpu_evtchn[chn]]); + __set_bit(chn, cpu_evtchn_mask[cpu]); + + cpu_evtchn[chn] = cpu; +} + +static void init_evtchn_cpu_bindings(void) +{ +#ifdef CONFIG_SMP + int i; + /* By default all event channels notify CPU#0. 
*/ + for (i = 0; i < NR_IRQS; i++) + irq_desc[i].affinity = cpumask_of_cpu(0); +#endif + + memset(cpu_evtchn, 0, sizeof(cpu_evtchn)); + memset(cpu_evtchn_mask[0], ~0, sizeof(cpu_evtchn_mask[0])); +} + +static inline unsigned int cpu_from_evtchn(unsigned int evtchn) +{ + return cpu_evtchn[evtchn]; +} + +static inline void clear_evtchn(int port) +{ + struct shared_info *s = HYPERVISOR_shared_info; + sync_clear_bit(port, &s->evtchn_pending[0]); +} + +static inline void set_evtchn(int port) +{ + struct shared_info *s = HYPERVISOR_shared_info; + sync_set_bit(port, &s->evtchn_pending[0]); +} + + +/** + * notify_remote_via_irq - send event to remote end of event channel via irq + * @irq: irq of event channel to send event to + * + * Unlike notify_remote_via_evtchn(), this is safe to use across + * save/restore. Notifications on a broken connection are silently + * dropped. + */ +void notify_remote_via_irq(int irq) +{ + int evtchn = evtchn_from_irq(irq); + + if (VALID_EVTCHN(evtchn)) + notify_remote_via_evtchn(evtchn); +} +EXPORT_SYMBOL_GPL(notify_remote_via_irq); + +static void mask_evtchn(int port) +{ + struct shared_info *s = HYPERVISOR_shared_info; + sync_set_bit(port, &s->evtchn_mask[0]); +} + +static void unmask_evtchn(int port) +{ + struct shared_info *s = HYPERVISOR_shared_info; + unsigned int cpu = get_cpu(); + + BUG_ON(!irqs_disabled()); + + /* Slow path (hypercall) if this is a non-local port. */ + if (unlikely(cpu != cpu_from_evtchn(port))) { + struct evtchn_unmask unmask = { .port = port }; + (void)HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask); + } else { + struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu); + + sync_clear_bit(port, &s->evtchn_mask[0]); + + /* + * The following is basically the equivalent of + * 'hw_resend_irq'. Just like a real IO-APIC we 'lose + * the interrupt edge' if the channel is masked. 
+ */ + if (sync_test_bit(port, &s->evtchn_pending[0]) && + !sync_test_and_set_bit(port / BITS_PER_LONG, + &vcpu_info->evtchn_pending_sel)) + vcpu_info->evtchn_upcall_pending = 1; + } + + put_cpu(); +} + +static int find_unbound_irq(void) +{ + int irq; + + /* Only allocate from dynirq range */ + for (irq = 0; irq < NR_IRQS; irq++) + if (irq_bindcount[irq] == 0) + break; + + if (irq == NR_IRQS) + panic("No available IRQ to bind to: increase NR_IRQS!\n"); + + return irq; +} + +int bind_evtchn_to_irq(unsigned int evtchn) +{ + int irq; + + spin_lock(&irq_mapping_update_lock); + + irq = evtchn_to_irq[evtchn]; + + if (irq == -1) { + irq = find_unbound_irq(); + + dynamic_irq_init(irq); + set_irq_chip_and_handler_name(irq, &xen_dynamic_chip, + handle_level_irq, "event"); + + evtchn_to_irq[evtchn] = irq; + irq_info[irq] = mk_irq_info(IRQT_EVTCHN, 0, evtchn); + } + + irq_bindcount[irq]++; + + spin_unlock(&irq_mapping_update_lock); + + return irq; +} +EXPORT_SYMBOL_GPL(bind_evtchn_to_irq); + +static int bind_ipi_to_irq(unsigned int ipi, unsigned int cpu) +{ + struct evtchn_bind_ipi bind_ipi; + int evtchn, irq; + + spin_lock(&irq_mapping_update_lock); + + irq = per_cpu(ipi_to_irq, cpu)[ipi]; + if (irq == -1) { + irq = find_unbound_irq(); + if (irq < 0) + goto out; + + dynamic_irq_init(irq); + set_irq_chip_and_handler_name(irq, &xen_dynamic_chip, + handle_level_irq, "ipi"); + + bind_ipi.vcpu = cpu; + if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_ipi, + &bind_ipi) != 0) + BUG(); + evtchn = bind_ipi.port; + + evtchn_to_irq[evtchn] = irq; + irq_info[irq] = mk_irq_info(IRQT_IPI, ipi, evtchn); + + per_cpu(ipi_to_irq, cpu)[ipi] = irq; + + bind_evtchn_to_cpu(evtchn, cpu); + } + + irq_bindcount[irq]++; + + out: + spin_unlock(&irq_mapping_update_lock); + return irq; +} + + +static int bind_virq_to_irq(unsigned int virq, unsigned int cpu) +{ + struct evtchn_bind_virq bind_virq; + int evtchn, irq; + + spin_lock(&irq_mapping_update_lock); + + irq = per_cpu(virq_to_irq, cpu)[virq]; + + if 
(irq == -1) { + bind_virq.virq = virq; + bind_virq.vcpu = cpu; + if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq, + &bind_virq) != 0) + BUG(); + evtchn = bind_virq.port; + + irq = find_unbound_irq(); + + dynamic_irq_init(irq); + set_irq_chip_and_handler_name(irq, &xen_dynamic_chip, + handle_level_irq, "virq"); + + evtchn_to_irq[evtchn] = irq; + irq_info[irq] = mk_irq_info(IRQT_VIRQ, virq, evtchn); + + per_cpu(virq_to_irq, cpu)[virq] = irq; + + bind_evtchn_to_cpu(evtchn, cpu); + } + + irq_bindcount[irq]++; + + spin_unlock(&irq_mapping_update_lock); + + return irq; +} + +static void unbind_from_irq(unsigned int irq) +{ + struct evtchn_close close; + int evtchn = evtchn_from_irq(irq); + + spin_lock(&irq_mapping_update_lock); + + if (VALID_EVTCHN(evtchn) && (--irq_bindcount[irq] == 0)) { + close.port = evtchn; + if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0) + BUG(); + + switch (type_from_irq(irq)) { + case IRQT_VIRQ: + per_cpu(virq_to_irq, cpu_from_evtchn(evtchn)) + [index_from_irq(irq)] = -1; + break; + default: + break; + } + + /* Closed ports are implicitly re-bound to VCPU0. 
*/ + bind_evtchn_to_cpu(evtchn, 0); + + evtchn_to_irq[evtchn] = -1; + irq_info[irq] = IRQ_UNBOUND; + + dynamic_irq_init(irq); + } + + spin_unlock(&irq_mapping_update_lock); +} + +int bind_evtchn_to_irqhandler(unsigned int evtchn, + irq_handler_t handler, + unsigned long irqflags, + const char *devname, void *dev_id) +{ + unsigned int irq; + int retval; + + irq = bind_evtchn_to_irq(evtchn); + retval = request_irq(irq, handler, irqflags, devname, dev_id); + if (retval != 0) { + unbind_from_irq(irq); + return retval; + } + + return irq; +} +EXPORT_SYMBOL_GPL(bind_evtchn_to_irqhandler); + +int bind_virq_to_irqhandler(unsigned int virq, unsigned int cpu, + irq_handler_t handler, + unsigned long irqflags, const char *devname, void *dev_id) +{ + unsigned int irq; + int retval; + + irq = bind_virq_to_irq(virq, cpu); + retval = request_irq(irq, handler, irqflags, devname, dev_id); + if (retval != 0) { + unbind_from_irq(irq); + return retval; + } + + return irq; +} +EXPORT_SYMBOL_GPL(bind_virq_to_irqhandler); + +int bind_ipi_to_irqhandler(enum ipi_vector ipi, + unsigned int cpu, + irq_handler_t handler, + unsigned long irqflags, + const char *devname, + void *dev_id) +{ + int irq, retval; + + irq = bind_ipi_to_irq(ipi, cpu); + if (irq < 0) + return irq; + + retval = request_irq(irq, handler, irqflags, devname, dev_id); + if (retval != 0) { + unbind_from_irq(irq); + return retval; + } + + return irq; +} + +void unbind_from_irqhandler(unsigned int irq, void *dev_id) +{ + free_irq(irq, dev_id); + unbind_from_irq(irq); +} +EXPORT_SYMBOL_GPL(unbind_from_irqhandler); + +void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector) +{ + int irq = per_cpu(ipi_to_irq, cpu)[vector]; + BUG_ON(irq < 0); + notify_remote_via_irq(irq); +} + + +/* + * Search the CPUs pending events bitmasks. For each one found, map + * the event number to an irq, and feed it into do_IRQ() for + * handling. + * + * Xen uses a two-level bitmap to speed searching. 
The first level is + * a bitset of words which contain pending event bits. The second + * level is a bitset of pending events themselves. + */ +void xen_evtchn_do_upcall(struct pt_regs *regs) +{ + int cpu = get_cpu(); + struct shared_info *s = HYPERVISOR_shared_info; + struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu); + unsigned long pending_words; + + vcpu_info->evtchn_upcall_pending = 0; + + /* NB. No need for a barrier here -- XCHG is a barrier on x86. */ + pending_words = xchg(&vcpu_info->evtchn_pending_sel, 0); + while (pending_words != 0) { + unsigned long pending_bits; + int word_idx = __ffs(pending_words); + pending_words &= ~(1UL << word_idx); + + while ((pending_bits = active_evtchns(cpu, s, word_idx)) != 0) { + int bit_idx = __ffs(pending_bits); + int port = (word_idx * BITS_PER_LONG) + bit_idx; + int irq = evtchn_to_irq[port]; + + if (irq != -1) { + regs->orig_ax = ~irq; + do_IRQ(regs); + } + } + } + + put_cpu(); +} + +/* Rebind an evtchn so that it gets delivered to a specific cpu */ +static void rebind_irq_to_cpu(unsigned irq, unsigned tcpu) +{ + struct evtchn_bind_vcpu bind_vcpu; + int evtchn = evtchn_from_irq(irq); + + if (!VALID_EVTCHN(evtchn)) + return; + + /* Send future instances of this interrupt to other vcpu. */ + bind_vcpu.port = evtchn; + bind_vcpu.vcpu = tcpu; + + /* + * If this fails, it usually just indicates that we're dealing with a + * virq or IPI channel, which don't actually need to be rebound. Ignore + * it, but don't do the xenlinux-level rebind in that case. 
+ */ + if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) >= 0) + bind_evtchn_to_cpu(evtchn, tcpu); +} + + +static void set_affinity_irq(unsigned irq, cpumask_t dest) +{ + unsigned tcpu = first_cpu(dest); + rebind_irq_to_cpu(irq, tcpu); +} + +static void enable_dynirq(unsigned int irq) +{ + int evtchn = evtchn_from_irq(irq); + + if (VALID_EVTCHN(evtchn)) + unmask_evtchn(evtchn); +} + +static void disable_dynirq(unsigned int irq) +{ + int evtchn = evtchn_from_irq(irq); + + if (VALID_EVTCHN(evtchn)) + mask_evtchn(evtchn); +} + +static void ack_dynirq(unsigned int irq) +{ + int evtchn = evtchn_from_irq(irq); + + move_native_irq(irq); + + if (VALID_EVTCHN(evtchn)) + clear_evtchn(evtchn); +} + +static int retrigger_dynirq(unsigned int irq) +{ + int evtchn = evtchn_from_irq(irq); + int ret = 0; + + if (VALID_EVTCHN(evtchn)) { + set_evtchn(evtchn); + ret = 1; + } + + return ret; +} + +static struct irq_chip xen_dynamic_chip __read_mostly = { + .name = "xen-dyn", + .mask = disable_dynirq, + .unmask = enable_dynirq, + .ack = ack_dynirq, + .set_affinity = set_affinity_irq, + .retrigger = retrigger_dynirq, +}; + +void __init xen_init_IRQ(void) +{ + int i; + + init_evtchn_cpu_bindings(); + + /* No event channels are 'live' right now. */ + for (i = 0; i < NR_EVENT_CHANNELS; i++) + mask_evtchn(i); + + /* Dynamic IRQ space is currently unbound. Zero the refcnts. 
*/ + for (i = 0; i < NR_IRQS; i++) + irq_bindcount[i] = 0; + + irq_ctx_init(smp_processor_id()); +} diff -urN old/drivers/xen/Makefile linux/drivers/xen/Makefile --- old/drivers/xen/Makefile 2008-03-10 13:22:27.000000000 +0800 +++ linux/drivers/xen/Makefile 2008-03-25 13:56:41.368764287 +0800 @@ -1,2 +1,2 @@ -obj-y += grant-table.o +obj-y += grant-table.o events.o obj-y += xenbus/ diff -urN old/include/xen/xen-ops.h linux/include/xen/xen-ops.h --- old/include/xen/xen-ops.h 1970-01-01 08:00:00.000000000 +0800 +++ linux/include/xen/xen-ops.h 2008-03-25 14:00:09.041321546 +0800 @@ -0,0 +1,6 @@ +#ifndef INCLUDE_XEN_OPS_H +#define INCLUDE_XEN_OPS_H + +DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu); + +#endif /* INCLUDE_XEN_OPS_H */ ^ permalink raw reply [flat|nested] 7+ messages in thread
* RE: Xen common code across architecture 2008-03-25 6:13 ` Dong, Eddie @ 2008-03-25 6:35 ` Dong, Eddie 2008-03-25 15:08 ` Jeremy Fitzhardinge 1 sibling, 0 replies; 7+ messages in thread From: Dong, Eddie @ 2008-03-25 6:35 UTC (permalink / raw) To: Dong, Eddie, Jeremy Fitzhardinge Cc: virtualization, linux-kernel, Andrew Morton, linux-ia64, kvm-ia64-devel, xen-ia64-devel [-- Attachment #1: Type: text/plain, Size: 980 bytes --] Dong, Eddie wrote: > Jeremy/Andrew: > > Isaku Yamahata, I, and some other IA64/Xen community members are > > working together to enable pv_ops for IA64 Linux. This patch is a > preparation to > move the common arch/x86/xen/events.c to drivers/xen (contents are > identical) against the > mm tree; it is based on Yamahata's IA64/pv_ops patch series. > In case you want to have a brief view of the whole pv_ops/IA64 patch > series, > please refer to the IA64 Linux mailing list. > > Thanks, Eddie > > Fix a typo. The merged one is attached too. Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> --- drivers/xen/events_old.c 2008-03-25 14:31:40.503525471 +0800 +++ drivers/xen/events.c 2008-03-25 14:19:39.841851430 +0800 @@ -37,7 +37,7 @@ #include <xen/interface/xen.h> #include <xen/interface/event_channel.h> -#include "xen-ops.h" +#include <xen/xen-ops.h> /* * This lock protects updates to the following mapping and reference-count [-- Attachment #2: typo --] [-- Type: application/octet-stream, Size: 350 bytes --] --- drivers/xen/events_old.c 2008-03-25 14:31:40.503525471 +0800 +++ drivers/xen/events.c 2008-03-25 14:19:39.841851430 +0800 @@ -37,7 +37,7 @@ #include <xen/interface/xen.h> #include <xen/interface/event_channel.h> -#include "xen-ops.h" +#include <xen/xen-ops.h> /* * This lock protects updates to the following mapping and reference-count [-- Attachment #3: move_xenirq3.patch --] [-- Type: application/octet-stream, Size: 30864 bytes --] Move events.c to drivers/xen for IA64/Xen support. 
Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> diff -urN old/arch/x86/xen/events.c linux/arch/x86/xen/events.c --- old/arch/x86/xen/events.c 2008-03-10 13:22:27.000000000 +0800 +++ linux/arch/x86/xen/events.c 1970-01-01 08:00:00.000000000 +0800 @@ -1,591 +0,0 @@ -/* - * Xen event channels - * - * Xen models interrupts with abstract event channels. Because each - * domain gets 1024 event channels, but NR_IRQ is not that large, we - * must dynamically map irqs<->event channels. The event channels - * interface with the rest of the kernel by defining a xen interrupt - * chip. When an event is recieved, it is mapped to an irq and sent - * through the normal interrupt processing path. - * - * There are four kinds of events which can be mapped to an event - * channel: - * - * 1. Inter-domain notifications. This includes all the virtual - * device events, since they're driven by front-ends in another domain - * (typically dom0). - * 2. VIRQs, typically used for timers. These are per-cpu events. - * 3. IPIs. - * 4. Hardware interrupts. Not supported at present. - * - * Jeremy Fitzhardinge <jeremy@xensource.com>, XenSource Inc, 2007 - */ - -#include <linux/linkage.h> -#include <linux/interrupt.h> -#include <linux/irq.h> -#include <linux/module.h> -#include <linux/string.h> - -#include <asm/ptrace.h> -#include <asm/irq.h> -#include <asm/sync_bitops.h> -#include <asm/xen/hypercall.h> -#include <asm/xen/hypervisor.h> - -#include <xen/events.h> -#include <xen/interface/xen.h> -#include <xen/interface/event_channel.h> - -#include "xen-ops.h" - -/* - * This lock protects updates to the following mapping and reference-count - * arrays. The lock does not need to be acquired to read the mapping tables. - */ -static DEFINE_SPINLOCK(irq_mapping_update_lock); - -/* IRQ <-> VIRQ mapping. */ -static DEFINE_PER_CPU(int, virq_to_irq[NR_VIRQS]) = {[0 ... NR_VIRQS-1] = -1}; - -/* IRQ <-> IPI mapping */ -static DEFINE_PER_CPU(int, ipi_to_irq[XEN_NR_IPIS]) = {[0 ... 
XEN_NR_IPIS-1] = -1}; - -/* Packed IRQ information: binding type, sub-type index, and event channel. */ -struct packed_irq -{ - unsigned short evtchn; - unsigned char index; - unsigned char type; -}; - -static struct packed_irq irq_info[NR_IRQS]; - -/* Binding types. */ -enum { - IRQT_UNBOUND, - IRQT_PIRQ, - IRQT_VIRQ, - IRQT_IPI, - IRQT_EVTCHN -}; - -/* Convenient shorthand for packed representation of an unbound IRQ. */ -#define IRQ_UNBOUND mk_irq_info(IRQT_UNBOUND, 0, 0) - -static int evtchn_to_irq[NR_EVENT_CHANNELS] = { - [0 ... NR_EVENT_CHANNELS-1] = -1 -}; -static unsigned long cpu_evtchn_mask[NR_CPUS][NR_EVENT_CHANNELS/BITS_PER_LONG]; -static u8 cpu_evtchn[NR_EVENT_CHANNELS]; - -/* Reference counts for bindings to IRQs. */ -static int irq_bindcount[NR_IRQS]; - -/* Xen will never allocate port zero for any purpose. */ -#define VALID_EVTCHN(chn) ((chn) != 0) - -/* - * Force a proper event-channel callback from Xen after clearing the - * callback mask. We do this in a very simple manner, by making a call - * down into Xen. The pending flag will be checked by Xen on return. - */ -void force_evtchn_callback(void) -{ - (void)HYPERVISOR_xen_version(0, NULL); -} -EXPORT_SYMBOL_GPL(force_evtchn_callback); - -static struct irq_chip xen_dynamic_chip; - -/* Constructor for packed IRQ information. */ -static inline struct packed_irq mk_irq_info(u32 type, u32 index, u32 evtchn) -{ - return (struct packed_irq) { evtchn, index, type }; -} - -/* - * Accessors for packed IRQ information. 
- */ -static inline unsigned int evtchn_from_irq(int irq) -{ - return irq_info[irq].evtchn; -} - -static inline unsigned int index_from_irq(int irq) -{ - return irq_info[irq].index; -} - -static inline unsigned int type_from_irq(int irq) -{ - return irq_info[irq].type; -} - -static inline unsigned long active_evtchns(unsigned int cpu, - struct shared_info *sh, - unsigned int idx) -{ - return (sh->evtchn_pending[idx] & - cpu_evtchn_mask[cpu][idx] & - ~sh->evtchn_mask[idx]); -} - -static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu) -{ - int irq = evtchn_to_irq[chn]; - - BUG_ON(irq == -1); -#ifdef CONFIG_SMP - irq_desc[irq].affinity = cpumask_of_cpu(cpu); -#endif - - __clear_bit(chn, cpu_evtchn_mask[cpu_evtchn[chn]]); - __set_bit(chn, cpu_evtchn_mask[cpu]); - - cpu_evtchn[chn] = cpu; -} - -static void init_evtchn_cpu_bindings(void) -{ -#ifdef CONFIG_SMP - int i; - /* By default all event channels notify CPU#0. */ - for (i = 0; i < NR_IRQS; i++) - irq_desc[i].affinity = cpumask_of_cpu(0); -#endif - - memset(cpu_evtchn, 0, sizeof(cpu_evtchn)); - memset(cpu_evtchn_mask[0], ~0, sizeof(cpu_evtchn_mask[0])); -} - -static inline unsigned int cpu_from_evtchn(unsigned int evtchn) -{ - return cpu_evtchn[evtchn]; -} - -static inline void clear_evtchn(int port) -{ - struct shared_info *s = HYPERVISOR_shared_info; - sync_clear_bit(port, &s->evtchn_pending[0]); -} - -static inline void set_evtchn(int port) -{ - struct shared_info *s = HYPERVISOR_shared_info; - sync_set_bit(port, &s->evtchn_pending[0]); -} - - -/** - * notify_remote_via_irq - send event to remote end of event channel via irq - * @irq: irq of event channel to send event to - * - * Unlike notify_remote_via_evtchn(), this is safe to use across - * save/restore. Notifications on a broken connection are silently - * dropped. 
- */ -void notify_remote_via_irq(int irq) -{ - int evtchn = evtchn_from_irq(irq); - - if (VALID_EVTCHN(evtchn)) - notify_remote_via_evtchn(evtchn); -} -EXPORT_SYMBOL_GPL(notify_remote_via_irq); - -static void mask_evtchn(int port) -{ - struct shared_info *s = HYPERVISOR_shared_info; - sync_set_bit(port, &s->evtchn_mask[0]); -} - -static void unmask_evtchn(int port) -{ - struct shared_info *s = HYPERVISOR_shared_info; - unsigned int cpu = get_cpu(); - - BUG_ON(!irqs_disabled()); - - /* Slow path (hypercall) if this is a non-local port. */ - if (unlikely(cpu != cpu_from_evtchn(port))) { - struct evtchn_unmask unmask = { .port = port }; - (void)HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask); - } else { - struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu); - - sync_clear_bit(port, &s->evtchn_mask[0]); - - /* - * The following is basically the equivalent of - * 'hw_resend_irq'. Just like a real IO-APIC we 'lose - * the interrupt edge' if the channel is masked. - */ - if (sync_test_bit(port, &s->evtchn_pending[0]) && - !sync_test_and_set_bit(port / BITS_PER_LONG, - &vcpu_info->evtchn_pending_sel)) - vcpu_info->evtchn_upcall_pending = 1; - } - - put_cpu(); -} - -static int find_unbound_irq(void) -{ - int irq; - - /* Only allocate from dynirq range */ - for (irq = 0; irq < NR_IRQS; irq++) - if (irq_bindcount[irq] == 0) - break; - - if (irq == NR_IRQS) - panic("No available IRQ to bind to: increase NR_IRQS!\n"); - - return irq; -} - -int bind_evtchn_to_irq(unsigned int evtchn) -{ - int irq; - - spin_lock(&irq_mapping_update_lock); - - irq = evtchn_to_irq[evtchn]; - - if (irq == -1) { - irq = find_unbound_irq(); - - dynamic_irq_init(irq); - set_irq_chip_and_handler_name(irq, &xen_dynamic_chip, - handle_level_irq, "event"); - - evtchn_to_irq[evtchn] = irq; - irq_info[irq] = mk_irq_info(IRQT_EVTCHN, 0, evtchn); - } - - irq_bindcount[irq]++; - - spin_unlock(&irq_mapping_update_lock); - - return irq; -} -EXPORT_SYMBOL_GPL(bind_evtchn_to_irq); - -static int 
bind_ipi_to_irq(unsigned int ipi, unsigned int cpu) -{ - struct evtchn_bind_ipi bind_ipi; - int evtchn, irq; - - spin_lock(&irq_mapping_update_lock); - - irq = per_cpu(ipi_to_irq, cpu)[ipi]; - if (irq == -1) { - irq = find_unbound_irq(); - if (irq < 0) - goto out; - - dynamic_irq_init(irq); - set_irq_chip_and_handler_name(irq, &xen_dynamic_chip, - handle_level_irq, "ipi"); - - bind_ipi.vcpu = cpu; - if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_ipi, - &bind_ipi) != 0) - BUG(); - evtchn = bind_ipi.port; - - evtchn_to_irq[evtchn] = irq; - irq_info[irq] = mk_irq_info(IRQT_IPI, ipi, evtchn); - - per_cpu(ipi_to_irq, cpu)[ipi] = irq; - - bind_evtchn_to_cpu(evtchn, cpu); - } - - irq_bindcount[irq]++; - - out: - spin_unlock(&irq_mapping_update_lock); - return irq; -} - - -static int bind_virq_to_irq(unsigned int virq, unsigned int cpu) -{ - struct evtchn_bind_virq bind_virq; - int evtchn, irq; - - spin_lock(&irq_mapping_update_lock); - - irq = per_cpu(virq_to_irq, cpu)[virq]; - - if (irq == -1) { - bind_virq.virq = virq; - bind_virq.vcpu = cpu; - if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq, - &bind_virq) != 0) - BUG(); - evtchn = bind_virq.port; - - irq = find_unbound_irq(); - - dynamic_irq_init(irq); - set_irq_chip_and_handler_name(irq, &xen_dynamic_chip, - handle_level_irq, "virq"); - - evtchn_to_irq[evtchn] = irq; - irq_info[irq] = mk_irq_info(IRQT_VIRQ, virq, evtchn); - - per_cpu(virq_to_irq, cpu)[virq] = irq; - - bind_evtchn_to_cpu(evtchn, cpu); - } - - irq_bindcount[irq]++; - - spin_unlock(&irq_mapping_update_lock); - - return irq; -} - -static void unbind_from_irq(unsigned int irq) -{ - struct evtchn_close close; - int evtchn = evtchn_from_irq(irq); - - spin_lock(&irq_mapping_update_lock); - - if (VALID_EVTCHN(evtchn) && (--irq_bindcount[irq] == 0)) { - close.port = evtchn; - if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0) - BUG(); - - switch (type_from_irq(irq)) { - case IRQT_VIRQ: - per_cpu(virq_to_irq, cpu_from_evtchn(evtchn)) - 
[index_from_irq(irq)] = -1; - break; - default: - break; - } - - /* Closed ports are implicitly re-bound to VCPU0. */ - bind_evtchn_to_cpu(evtchn, 0); - - evtchn_to_irq[evtchn] = -1; - irq_info[irq] = IRQ_UNBOUND; - - dynamic_irq_init(irq); - } - - spin_unlock(&irq_mapping_update_lock); -} - -int bind_evtchn_to_irqhandler(unsigned int evtchn, - irq_handler_t handler, - unsigned long irqflags, - const char *devname, void *dev_id) -{ - unsigned int irq; - int retval; - - irq = bind_evtchn_to_irq(evtchn); - retval = request_irq(irq, handler, irqflags, devname, dev_id); - if (retval != 0) { - unbind_from_irq(irq); - return retval; - } - - return irq; -} -EXPORT_SYMBOL_GPL(bind_evtchn_to_irqhandler); - -int bind_virq_to_irqhandler(unsigned int virq, unsigned int cpu, - irq_handler_t handler, - unsigned long irqflags, const char *devname, void *dev_id) -{ - unsigned int irq; - int retval; - - irq = bind_virq_to_irq(virq, cpu); - retval = request_irq(irq, handler, irqflags, devname, dev_id); - if (retval != 0) { - unbind_from_irq(irq); - return retval; - } - - return irq; -} -EXPORT_SYMBOL_GPL(bind_virq_to_irqhandler); - -int bind_ipi_to_irqhandler(enum ipi_vector ipi, - unsigned int cpu, - irq_handler_t handler, - unsigned long irqflags, - const char *devname, - void *dev_id) -{ - int irq, retval; - - irq = bind_ipi_to_irq(ipi, cpu); - if (irq < 0) - return irq; - - retval = request_irq(irq, handler, irqflags, devname, dev_id); - if (retval != 0) { - unbind_from_irq(irq); - return retval; - } - - return irq; -} - -void unbind_from_irqhandler(unsigned int irq, void *dev_id) -{ - free_irq(irq, dev_id); - unbind_from_irq(irq); -} -EXPORT_SYMBOL_GPL(unbind_from_irqhandler); - -void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector) -{ - int irq = per_cpu(ipi_to_irq, cpu)[vector]; - BUG_ON(irq < 0); - notify_remote_via_irq(irq); -} - - -/* - * Search the CPUs pending events bitmasks. 
For each one found, map - * the event number to an irq, and feed it into do_IRQ() for - * handling. - * - * Xen uses a two-level bitmap to speed searching. The first level is - * a bitset of words which contain pending event bits. The second - * level is a bitset of pending events themselves. - */ -void xen_evtchn_do_upcall(struct pt_regs *regs) -{ - int cpu = get_cpu(); - struct shared_info *s = HYPERVISOR_shared_info; - struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu); - unsigned long pending_words; - - vcpu_info->evtchn_upcall_pending = 0; - - /* NB. No need for a barrier here -- XCHG is a barrier on x86. */ - pending_words = xchg(&vcpu_info->evtchn_pending_sel, 0); - while (pending_words != 0) { - unsigned long pending_bits; - int word_idx = __ffs(pending_words); - pending_words &= ~(1UL << word_idx); - - while ((pending_bits = active_evtchns(cpu, s, word_idx)) != 0) { - int bit_idx = __ffs(pending_bits); - int port = (word_idx * BITS_PER_LONG) + bit_idx; - int irq = evtchn_to_irq[port]; - - if (irq != -1) { - regs->orig_ax = ~irq; - do_IRQ(regs); - } - } - } - - put_cpu(); -} - -/* Rebind an evtchn so that it gets delivered to a specific cpu */ -static void rebind_irq_to_cpu(unsigned irq, unsigned tcpu) -{ - struct evtchn_bind_vcpu bind_vcpu; - int evtchn = evtchn_from_irq(irq); - - if (!VALID_EVTCHN(evtchn)) - return; - - /* Send future instances of this interrupt to other vcpu. */ - bind_vcpu.port = evtchn; - bind_vcpu.vcpu = tcpu; - - /* - * If this fails, it usually just indicates that we're dealing with a - * virq or IPI channel, which don't actually need to be rebound. Ignore - * it, but don't do the xenlinux-level rebind in that case. 
- */ - if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) >= 0) - bind_evtchn_to_cpu(evtchn, tcpu); -} - - -static void set_affinity_irq(unsigned irq, cpumask_t dest) -{ - unsigned tcpu = first_cpu(dest); - rebind_irq_to_cpu(irq, tcpu); -} - -static void enable_dynirq(unsigned int irq) -{ - int evtchn = evtchn_from_irq(irq); - - if (VALID_EVTCHN(evtchn)) - unmask_evtchn(evtchn); -} - -static void disable_dynirq(unsigned int irq) -{ - int evtchn = evtchn_from_irq(irq); - - if (VALID_EVTCHN(evtchn)) - mask_evtchn(evtchn); -} - -static void ack_dynirq(unsigned int irq) -{ - int evtchn = evtchn_from_irq(irq); - - move_native_irq(irq); - - if (VALID_EVTCHN(evtchn)) - clear_evtchn(evtchn); -} - -static int retrigger_dynirq(unsigned int irq) -{ - int evtchn = evtchn_from_irq(irq); - int ret = 0; - - if (VALID_EVTCHN(evtchn)) { - set_evtchn(evtchn); - ret = 1; - } - - return ret; -} - -static struct irq_chip xen_dynamic_chip __read_mostly = { - .name = "xen-dyn", - .mask = disable_dynirq, - .unmask = enable_dynirq, - .ack = ack_dynirq, - .set_affinity = set_affinity_irq, - .retrigger = retrigger_dynirq, -}; - -void __init xen_init_IRQ(void) -{ - int i; - - init_evtchn_cpu_bindings(); - - /* No event channels are 'live' right now. */ - for (i = 0; i < NR_EVENT_CHANNELS; i++) - mask_evtchn(i); - - /* Dynamic IRQ space is currently unbound. Zero the refcnts. 
*/ - for (i = 0; i < NR_IRQS; i++) - irq_bindcount[i] = 0; - - irq_ctx_init(smp_processor_id()); -} diff -urN old/arch/x86/xen/Makefile linux/arch/x86/xen/Makefile --- old/arch/x86/xen/Makefile 2008-03-10 13:22:27.000000000 +0800 +++ linux/arch/x86/xen/Makefile 2008-03-25 13:56:41.367764448 +0800 @@ -1,4 +1,4 @@ obj-y := enlighten.o setup.o features.o multicalls.o mmu.o \ - events.o time.o manage.o xen-asm.o + time.o manage.o xen-asm.o obj-$(CONFIG_SMP) += smp.o diff -urN old/arch/x86/xen/xen-ops.h linux/arch/x86/xen/xen-ops.h --- old/arch/x86/xen/xen-ops.h 2008-03-25 13:21:09.996527604 +0800 +++ linux/arch/x86/xen/xen-ops.h 2008-03-25 13:59:16.349809137 +0800 @@ -2,6 +2,7 @@ #define XEN_OPS_H #include <linux/init.h> +#include <xen/xen-ops.h> /* These are code, but not functions. Defined in entry.S */ extern const char xen_hypervisor_callback[]; @@ -9,7 +10,6 @@ void xen_copy_trap_info(struct trap_info *traps); -DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu); DECLARE_PER_CPU(unsigned long, xen_cr3); DECLARE_PER_CPU(unsigned long, xen_current_cr3); diff -urN old/drivers/xen/events.c linux/drivers/xen/events.c --- old/drivers/xen/events.c 1970-01-01 08:00:00.000000000 +0800 +++ linux/drivers/xen/events.c 2008-03-25 13:56:41.368764287 +0800 @@ -0,0 +1,591 @@ +/* + * Xen event channels + * + * Xen models interrupts with abstract event channels. Because each + * domain gets 1024 event channels, but NR_IRQ is not that large, we + * must dynamically map irqs<->event channels. The event channels + * interface with the rest of the kernel by defining a xen interrupt + * chip. When an event is recieved, it is mapped to an irq and sent + * through the normal interrupt processing path. + * + * There are four kinds of events which can be mapped to an event + * channel: + * + * 1. Inter-domain notifications. This includes all the virtual + * device events, since they're driven by front-ends in another domain + * (typically dom0). + * 2. VIRQs, typically used for timers. 
These are per-cpu events. + * 3. IPIs. + * 4. Hardware interrupts. Not supported at present. + * + * Jeremy Fitzhardinge <jeremy@xensource.com>, XenSource Inc, 2007 + */ + +#include <linux/linkage.h> +#include <linux/interrupt.h> +#include <linux/irq.h> +#include <linux/module.h> +#include <linux/string.h> + +#include <asm/ptrace.h> +#include <asm/irq.h> +#include <asm/sync_bitops.h> +#include <asm/xen/hypercall.h> +#include <asm/xen/hypervisor.h> + +#include <xen/events.h> +#include <xen/interface/xen.h> +#include <xen/interface/event_channel.h> + +#include <xen/xen-ops.h> + +/* + * This lock protects updates to the following mapping and reference-count + * arrays. The lock does not need to be acquired to read the mapping tables. + */ +static DEFINE_SPINLOCK(irq_mapping_update_lock); + +/* IRQ <-> VIRQ mapping. */ +static DEFINE_PER_CPU(int, virq_to_irq[NR_VIRQS]) = {[0 ... NR_VIRQS-1] = -1}; + +/* IRQ <-> IPI mapping */ +static DEFINE_PER_CPU(int, ipi_to_irq[XEN_NR_IPIS]) = {[0 ... XEN_NR_IPIS-1] = -1}; + +/* Packed IRQ information: binding type, sub-type index, and event channel. */ +struct packed_irq +{ + unsigned short evtchn; + unsigned char index; + unsigned char type; +}; + +static struct packed_irq irq_info[NR_IRQS]; + +/* Binding types. */ +enum { + IRQT_UNBOUND, + IRQT_PIRQ, + IRQT_VIRQ, + IRQT_IPI, + IRQT_EVTCHN +}; + +/* Convenient shorthand for packed representation of an unbound IRQ. */ +#define IRQ_UNBOUND mk_irq_info(IRQT_UNBOUND, 0, 0) + +static int evtchn_to_irq[NR_EVENT_CHANNELS] = { + [0 ... NR_EVENT_CHANNELS-1] = -1 +}; +static unsigned long cpu_evtchn_mask[NR_CPUS][NR_EVENT_CHANNELS/BITS_PER_LONG]; +static u8 cpu_evtchn[NR_EVENT_CHANNELS]; + +/* Reference counts for bindings to IRQs. */ +static int irq_bindcount[NR_IRQS]; + +/* Xen will never allocate port zero for any purpose. */ +#define VALID_EVTCHN(chn) ((chn) != 0) + +/* + * Force a proper event-channel callback from Xen after clearing the + * callback mask. 
We do this in a very simple manner, by making a call + * down into Xen. The pending flag will be checked by Xen on return. + */ +void force_evtchn_callback(void) +{ + (void)HYPERVISOR_xen_version(0, NULL); +} +EXPORT_SYMBOL_GPL(force_evtchn_callback); + +static struct irq_chip xen_dynamic_chip; + +/* Constructor for packed IRQ information. */ +static inline struct packed_irq mk_irq_info(u32 type, u32 index, u32 evtchn) +{ + return (struct packed_irq) { evtchn, index, type }; +} + +/* + * Accessors for packed IRQ information. + */ +static inline unsigned int evtchn_from_irq(int irq) +{ + return irq_info[irq].evtchn; +} + +static inline unsigned int index_from_irq(int irq) +{ + return irq_info[irq].index; +} + +static inline unsigned int type_from_irq(int irq) +{ + return irq_info[irq].type; +} + +static inline unsigned long active_evtchns(unsigned int cpu, + struct shared_info *sh, + unsigned int idx) +{ + return (sh->evtchn_pending[idx] & + cpu_evtchn_mask[cpu][idx] & + ~sh->evtchn_mask[idx]); +} + +static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu) +{ + int irq = evtchn_to_irq[chn]; + + BUG_ON(irq == -1); +#ifdef CONFIG_SMP + irq_desc[irq].affinity = cpumask_of_cpu(cpu); +#endif + + __clear_bit(chn, cpu_evtchn_mask[cpu_evtchn[chn]]); + __set_bit(chn, cpu_evtchn_mask[cpu]); + + cpu_evtchn[chn] = cpu; +} + +static void init_evtchn_cpu_bindings(void) +{ +#ifdef CONFIG_SMP + int i; + /* By default all event channels notify CPU#0. 
*/ + for (i = 0; i < NR_IRQS; i++) + irq_desc[i].affinity = cpumask_of_cpu(0); +#endif + + memset(cpu_evtchn, 0, sizeof(cpu_evtchn)); + memset(cpu_evtchn_mask[0], ~0, sizeof(cpu_evtchn_mask[0])); +} + +static inline unsigned int cpu_from_evtchn(unsigned int evtchn) +{ + return cpu_evtchn[evtchn]; +} + +static inline void clear_evtchn(int port) +{ + struct shared_info *s = HYPERVISOR_shared_info; + sync_clear_bit(port, &s->evtchn_pending[0]); +} + +static inline void set_evtchn(int port) +{ + struct shared_info *s = HYPERVISOR_shared_info; + sync_set_bit(port, &s->evtchn_pending[0]); +} + + +/** + * notify_remote_via_irq - send event to remote end of event channel via irq + * @irq: irq of event channel to send event to + * + * Unlike notify_remote_via_evtchn(), this is safe to use across + * save/restore. Notifications on a broken connection are silently + * dropped. + */ +void notify_remote_via_irq(int irq) +{ + int evtchn = evtchn_from_irq(irq); + + if (VALID_EVTCHN(evtchn)) + notify_remote_via_evtchn(evtchn); +} +EXPORT_SYMBOL_GPL(notify_remote_via_irq); + +static void mask_evtchn(int port) +{ + struct shared_info *s = HYPERVISOR_shared_info; + sync_set_bit(port, &s->evtchn_mask[0]); +} + +static void unmask_evtchn(int port) +{ + struct shared_info *s = HYPERVISOR_shared_info; + unsigned int cpu = get_cpu(); + + BUG_ON(!irqs_disabled()); + + /* Slow path (hypercall) if this is a non-local port. */ + if (unlikely(cpu != cpu_from_evtchn(port))) { + struct evtchn_unmask unmask = { .port = port }; + (void)HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask); + } else { + struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu); + + sync_clear_bit(port, &s->evtchn_mask[0]); + + /* + * The following is basically the equivalent of + * 'hw_resend_irq'. Just like a real IO-APIC we 'lose + * the interrupt edge' if the channel is masked. 
+ */ + if (sync_test_bit(port, &s->evtchn_pending[0]) && + !sync_test_and_set_bit(port / BITS_PER_LONG, + &vcpu_info->evtchn_pending_sel)) + vcpu_info->evtchn_upcall_pending = 1; + } + + put_cpu(); +} + +static int find_unbound_irq(void) +{ + int irq; + + /* Only allocate from dynirq range */ + for (irq = 0; irq < NR_IRQS; irq++) + if (irq_bindcount[irq] == 0) + break; + + if (irq == NR_IRQS) + panic("No available IRQ to bind to: increase NR_IRQS!\n"); + + return irq; +} + +int bind_evtchn_to_irq(unsigned int evtchn) +{ + int irq; + + spin_lock(&irq_mapping_update_lock); + + irq = evtchn_to_irq[evtchn]; + + if (irq == -1) { + irq = find_unbound_irq(); + + dynamic_irq_init(irq); + set_irq_chip_and_handler_name(irq, &xen_dynamic_chip, + handle_level_irq, "event"); + + evtchn_to_irq[evtchn] = irq; + irq_info[irq] = mk_irq_info(IRQT_EVTCHN, 0, evtchn); + } + + irq_bindcount[irq]++; + + spin_unlock(&irq_mapping_update_lock); + + return irq; +} +EXPORT_SYMBOL_GPL(bind_evtchn_to_irq); + +static int bind_ipi_to_irq(unsigned int ipi, unsigned int cpu) +{ + struct evtchn_bind_ipi bind_ipi; + int evtchn, irq; + + spin_lock(&irq_mapping_update_lock); + + irq = per_cpu(ipi_to_irq, cpu)[ipi]; + if (irq == -1) { + irq = find_unbound_irq(); + if (irq < 0) + goto out; + + dynamic_irq_init(irq); + set_irq_chip_and_handler_name(irq, &xen_dynamic_chip, + handle_level_irq, "ipi"); + + bind_ipi.vcpu = cpu; + if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_ipi, + &bind_ipi) != 0) + BUG(); + evtchn = bind_ipi.port; + + evtchn_to_irq[evtchn] = irq; + irq_info[irq] = mk_irq_info(IRQT_IPI, ipi, evtchn); + + per_cpu(ipi_to_irq, cpu)[ipi] = irq; + + bind_evtchn_to_cpu(evtchn, cpu); + } + + irq_bindcount[irq]++; + + out: + spin_unlock(&irq_mapping_update_lock); + return irq; +} + + +static int bind_virq_to_irq(unsigned int virq, unsigned int cpu) +{ + struct evtchn_bind_virq bind_virq; + int evtchn, irq; + + spin_lock(&irq_mapping_update_lock); + + irq = per_cpu(virq_to_irq, cpu)[virq]; + + if 
(irq == -1) { + bind_virq.virq = virq; + bind_virq.vcpu = cpu; + if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq, + &bind_virq) != 0) + BUG(); + evtchn = bind_virq.port; + + irq = find_unbound_irq(); + + dynamic_irq_init(irq); + set_irq_chip_and_handler_name(irq, &xen_dynamic_chip, + handle_level_irq, "virq"); + + evtchn_to_irq[evtchn] = irq; + irq_info[irq] = mk_irq_info(IRQT_VIRQ, virq, evtchn); + + per_cpu(virq_to_irq, cpu)[virq] = irq; + + bind_evtchn_to_cpu(evtchn, cpu); + } + + irq_bindcount[irq]++; + + spin_unlock(&irq_mapping_update_lock); + + return irq; +} + +static void unbind_from_irq(unsigned int irq) +{ + struct evtchn_close close; + int evtchn = evtchn_from_irq(irq); + + spin_lock(&irq_mapping_update_lock); + + if (VALID_EVTCHN(evtchn) && (--irq_bindcount[irq] == 0)) { + close.port = evtchn; + if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0) + BUG(); + + switch (type_from_irq(irq)) { + case IRQT_VIRQ: + per_cpu(virq_to_irq, cpu_from_evtchn(evtchn)) + [index_from_irq(irq)] = -1; + break; + default: + break; + } + + /* Closed ports are implicitly re-bound to VCPU0. 
*/ + bind_evtchn_to_cpu(evtchn, 0); + + evtchn_to_irq[evtchn] = -1; + irq_info[irq] = IRQ_UNBOUND; + + dynamic_irq_init(irq); + } + + spin_unlock(&irq_mapping_update_lock); +} + +int bind_evtchn_to_irqhandler(unsigned int evtchn, + irq_handler_t handler, + unsigned long irqflags, + const char *devname, void *dev_id) +{ + unsigned int irq; + int retval; + + irq = bind_evtchn_to_irq(evtchn); + retval = request_irq(irq, handler, irqflags, devname, dev_id); + if (retval != 0) { + unbind_from_irq(irq); + return retval; + } + + return irq; +} +EXPORT_SYMBOL_GPL(bind_evtchn_to_irqhandler); + +int bind_virq_to_irqhandler(unsigned int virq, unsigned int cpu, + irq_handler_t handler, + unsigned long irqflags, const char *devname, void *dev_id) +{ + unsigned int irq; + int retval; + + irq = bind_virq_to_irq(virq, cpu); + retval = request_irq(irq, handler, irqflags, devname, dev_id); + if (retval != 0) { + unbind_from_irq(irq); + return retval; + } + + return irq; +} +EXPORT_SYMBOL_GPL(bind_virq_to_irqhandler); + +int bind_ipi_to_irqhandler(enum ipi_vector ipi, + unsigned int cpu, + irq_handler_t handler, + unsigned long irqflags, + const char *devname, + void *dev_id) +{ + int irq, retval; + + irq = bind_ipi_to_irq(ipi, cpu); + if (irq < 0) + return irq; + + retval = request_irq(irq, handler, irqflags, devname, dev_id); + if (retval != 0) { + unbind_from_irq(irq); + return retval; + } + + return irq; +} + +void unbind_from_irqhandler(unsigned int irq, void *dev_id) +{ + free_irq(irq, dev_id); + unbind_from_irq(irq); +} +EXPORT_SYMBOL_GPL(unbind_from_irqhandler); + +void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector) +{ + int irq = per_cpu(ipi_to_irq, cpu)[vector]; + BUG_ON(irq < 0); + notify_remote_via_irq(irq); +} + + +/* + * Search the CPUs pending events bitmasks. For each one found, map + * the event number to an irq, and feed it into do_IRQ() for + * handling. + * + * Xen uses a two-level bitmap to speed searching. 
The first level is + * a bitset of words which contain pending event bits. The second + * level is a bitset of pending events themselves. + */ +void xen_evtchn_do_upcall(struct pt_regs *regs) +{ + int cpu = get_cpu(); + struct shared_info *s = HYPERVISOR_shared_info; + struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu); + unsigned long pending_words; + + vcpu_info->evtchn_upcall_pending = 0; + + /* NB. No need for a barrier here -- XCHG is a barrier on x86. */ + pending_words = xchg(&vcpu_info->evtchn_pending_sel, 0); + while (pending_words != 0) { + unsigned long pending_bits; + int word_idx = __ffs(pending_words); + pending_words &= ~(1UL << word_idx); + + while ((pending_bits = active_evtchns(cpu, s, word_idx)) != 0) { + int bit_idx = __ffs(pending_bits); + int port = (word_idx * BITS_PER_LONG) + bit_idx; + int irq = evtchn_to_irq[port]; + + if (irq != -1) { + regs->orig_ax = ~irq; + do_IRQ(regs); + } + } + } + + put_cpu(); +} + +/* Rebind an evtchn so that it gets delivered to a specific cpu */ +static void rebind_irq_to_cpu(unsigned irq, unsigned tcpu) +{ + struct evtchn_bind_vcpu bind_vcpu; + int evtchn = evtchn_from_irq(irq); + + if (!VALID_EVTCHN(evtchn)) + return; + + /* Send future instances of this interrupt to other vcpu. */ + bind_vcpu.port = evtchn; + bind_vcpu.vcpu = tcpu; + + /* + * If this fails, it usually just indicates that we're dealing with a + * virq or IPI channel, which don't actually need to be rebound. Ignore + * it, but don't do the xenlinux-level rebind in that case. 
+ */ + if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) >= 0) + bind_evtchn_to_cpu(evtchn, tcpu); +} + + +static void set_affinity_irq(unsigned irq, cpumask_t dest) +{ + unsigned tcpu = first_cpu(dest); + rebind_irq_to_cpu(irq, tcpu); +} + +static void enable_dynirq(unsigned int irq) +{ + int evtchn = evtchn_from_irq(irq); + + if (VALID_EVTCHN(evtchn)) + unmask_evtchn(evtchn); +} + +static void disable_dynirq(unsigned int irq) +{ + int evtchn = evtchn_from_irq(irq); + + if (VALID_EVTCHN(evtchn)) + mask_evtchn(evtchn); +} + +static void ack_dynirq(unsigned int irq) +{ + int evtchn = evtchn_from_irq(irq); + + move_native_irq(irq); + + if (VALID_EVTCHN(evtchn)) + clear_evtchn(evtchn); +} + +static int retrigger_dynirq(unsigned int irq) +{ + int evtchn = evtchn_from_irq(irq); + int ret = 0; + + if (VALID_EVTCHN(evtchn)) { + set_evtchn(evtchn); + ret = 1; + } + + return ret; +} + +static struct irq_chip xen_dynamic_chip __read_mostly = { + .name = "xen-dyn", + .mask = disable_dynirq, + .unmask = enable_dynirq, + .ack = ack_dynirq, + .set_affinity = set_affinity_irq, + .retrigger = retrigger_dynirq, +}; + +void __init xen_init_IRQ(void) +{ + int i; + + init_evtchn_cpu_bindings(); + + /* No event channels are 'live' right now. */ + for (i = 0; i < NR_EVENT_CHANNELS; i++) + mask_evtchn(i); + + /* Dynamic IRQ space is currently unbound. Zero the refcnts. 
*/ + for (i = 0; i < NR_IRQS; i++) + irq_bindcount[i] = 0; + + irq_ctx_init(smp_processor_id()); +} diff -urN old/drivers/xen/Makefile linux/drivers/xen/Makefile --- old/drivers/xen/Makefile 2008-03-10 13:22:27.000000000 +0800 +++ linux/drivers/xen/Makefile 2008-03-25 13:56:41.368764287 +0800 @@ -1,2 +1,2 @@ -obj-y += grant-table.o +obj-y += grant-table.o events.o obj-y += xenbus/ diff -urN old/include/xen/xen-ops.h linux/include/xen/xen-ops.h --- old/include/xen/xen-ops.h 1970-01-01 08:00:00.000000000 +0800 +++ linux/include/xen/xen-ops.h 2008-03-25 14:00:09.041321546 +0800 @@ -0,0 +1,6 @@ +#ifndef INCLUDE_XEN_OPS_H +#define INCLUDE_XEN_OPS_H + +DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu); + +#endif /* INCLUDE_XEN_OPS_H */ ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Xen common code across architecture 2008-03-25 6:13 ` Dong, Eddie 2008-03-25 6:35 ` Dong, Eddie @ 2008-03-25 15:08 ` Jeremy Fitzhardinge 1 sibling, 0 replies; 7+ messages in thread From: Jeremy Fitzhardinge @ 2008-03-25 15:08 UTC (permalink / raw) To: Dong, Eddie Cc: virtualization, linux-kernel, Andrew Morton, linux-ia64, kvm-ia64-devel, xen-ia64-devel Dong, Eddie wrote: > Jeremy/Andrew: > > Isaku Yamahata, I and some other IA64/Xen community members are > > working together to enable pv_ops for IA64 Linux. This patch is a > preparation to > move common arch/x86/xen/events.c to drivers/xen (contents are > identical) against > mm tree, it is based on Yamahata's IA64/pv_ops patch series. > In case you want to have a brief view of the whole pv_ops/IA64 patch > series, > please refer to the IA64 Linux mailing list. > How do you want to manage this work? I'm currently basing off Ingo+tglx's x86.git tree. Would you like me to track these kinds of common-code changes in my tree, while you maintain a separate ia64-specific tree? > Thanks, Eddie > > > Move events.c to drivers/xen for IA64/Xen support. > Looks reasonable. One comment below. > Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> > > diff -urN old/arch/x86/xen/events.c linux/arch/x86/xen/events.c > --- old/arch/x86/xen/events.c 2008-03-10 13:22:27.000000000 +0800 > +++ linux/arch/x86/xen/events.c 1970-01-01 08:00:00.000000000 +0800 > @@ -1,591 +0,0 @@ > -/* > - * Xen event channels > - * > - * Xen models interrupts with abstract event channels. Because each > - * domain gets 1024 event channels, but NR_IRQ is not that large, we > - * must dynamically map irqs<->event channels. The event channels > - * interface with the rest of the kernel by defining a xen interrupt > - * chip. When an event is recieved, it is mapped to an irq and sent > - * through the normal interrupt processing path. > - * > - * There are four kinds of events which can be mapped to an event > - * channel: > - * > - * 1.
Inter-domain notifications. This includes all the virtual > - * device events, since they're driven by front-ends in another > domain > - * (typically dom0). > - * 2. VIRQs, typically used for timers. These are per-cpu events. > - * 3. IPIs. > - * 4. Hardware interrupts. Not supported at present. > - * > - * Jeremy Fitzhardinge <jeremy@xensource.com>, XenSource Inc, 2007 > - */ > - > -#include <linux/linkage.h> > -#include <linux/interrupt.h> > -#include <linux/irq.h> > -#include <linux/module.h> > -#include <linux/string.h> > - > -#include <asm/ptrace.h> > -#include <asm/irq.h> > -#include <asm/sync_bitops.h> > -#include <asm/xen/hypercall.h> > -#include <asm/xen/hypervisor.h> > - > -#include <xen/events.h> > -#include <xen/interface/xen.h> > -#include <xen/interface/event_channel.h> > - > -#include "xen-ops.h" > - > -/* > - * This lock protects updates to the following mapping and > reference-count > - * arrays. The lock does not need to be acquired to read the mapping > tables. > - */ > -static DEFINE_SPINLOCK(irq_mapping_update_lock); > - > -/* IRQ <-> VIRQ mapping. */ > -static DEFINE_PER_CPU(int, virq_to_irq[NR_VIRQS]) = {[0 ... NR_VIRQS-1] > = -1}; > - > -/* IRQ <-> IPI mapping */ > -static DEFINE_PER_CPU(int, ipi_to_irq[XEN_NR_IPIS]) = {[0 ... > XEN_NR_IPIS-1] = -1}; > - > -/* Packed IRQ information: binding type, sub-type index, and event > channel. */ > -struct packed_irq > -{ > - unsigned short evtchn; > - unsigned char index; > - unsigned char type; > -}; > - > -static struct packed_irq irq_info[NR_IRQS]; > - > -/* Binding types. */ > -enum { > - IRQT_UNBOUND, > - IRQT_PIRQ, > - IRQT_VIRQ, > - IRQT_IPI, > - IRQT_EVTCHN > -}; > - > -/* Convenient shorthand for packed representation of an unbound IRQ. */ > -#define IRQ_UNBOUND mk_irq_info(IRQT_UNBOUND, 0, 0) > - > -static int evtchn_to_irq[NR_EVENT_CHANNELS] = { > - [0 ... 
NR_EVENT_CHANNELS-1] = -1 > -}; > -static unsigned long > cpu_evtchn_mask[NR_CPUS][NR_EVENT_CHANNELS/BITS_PER_LONG]; > -static u8 cpu_evtchn[NR_EVENT_CHANNELS]; > - > -/* Reference counts for bindings to IRQs. */ > -static int irq_bindcount[NR_IRQS]; > - > -/* Xen will never allocate port zero for any purpose. */ > -#define VALID_EVTCHN(chn) ((chn) != 0) > - > -/* > - * Force a proper event-channel callback from Xen after clearing the > - * callback mask. We do this in a very simple manner, by making a call > - * down into Xen. The pending flag will be checked by Xen on return. > - */ > -void force_evtchn_callback(void) > -{ > - (void)HYPERVISOR_xen_version(0, NULL); > -} > -EXPORT_SYMBOL_GPL(force_evtchn_callback); > - > -static struct irq_chip xen_dynamic_chip; > - > -/* Constructor for packed IRQ information. */ > -static inline struct packed_irq mk_irq_info(u32 type, u32 index, u32 > evtchn) > -{ > - return (struct packed_irq) { evtchn, index, type }; > -} > - > -/* > - * Accessors for packed IRQ information. 
> - */ > -static inline unsigned int evtchn_from_irq(int irq) > -{ > - return irq_info[irq].evtchn; > -} > - > -static inline unsigned int index_from_irq(int irq) > -{ > - return irq_info[irq].index; > -} > - > -static inline unsigned int type_from_irq(int irq) > -{ > - return irq_info[irq].type; > -} > - > -static inline unsigned long active_evtchns(unsigned int cpu, > - struct shared_info *sh, > - unsigned int idx) > -{ > - return (sh->evtchn_pending[idx] & > - cpu_evtchn_mask[cpu][idx] & > - ~sh->evtchn_mask[idx]); > -} > - > -static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu) > -{ > - int irq = evtchn_to_irq[chn]; > - > - BUG_ON(irq == -1); > -#ifdef CONFIG_SMP > - irq_desc[irq].affinity = cpumask_of_cpu(cpu); > -#endif > - > - __clear_bit(chn, cpu_evtchn_mask[cpu_evtchn[chn]]); > - __set_bit(chn, cpu_evtchn_mask[cpu]); > - > - cpu_evtchn[chn] = cpu; > -} > - > -static void init_evtchn_cpu_bindings(void) > -{ > -#ifdef CONFIG_SMP > - int i; > - /* By default all event channels notify CPU#0. */ > - for (i = 0; i < NR_IRQS; i++) > - irq_desc[i].affinity = cpumask_of_cpu(0); > -#endif > - > - memset(cpu_evtchn, 0, sizeof(cpu_evtchn)); > - memset(cpu_evtchn_mask[0], ~0, sizeof(cpu_evtchn_mask[0])); > -} > - > -static inline unsigned int cpu_from_evtchn(unsigned int evtchn) > -{ > - return cpu_evtchn[evtchn]; > -} > - > -static inline void clear_evtchn(int port) > -{ > - struct shared_info *s = HYPERVISOR_shared_info; > - sync_clear_bit(port, &s->evtchn_pending[0]); > -} > - > -static inline void set_evtchn(int port) > -{ > - struct shared_info *s = HYPERVISOR_shared_info; > - sync_set_bit(port, &s->evtchn_pending[0]); > -} > - > - > -/** > - * notify_remote_via_irq - send event to remote end of event channel > via irq > - * @irq: irq of event channel to send event to > - * > - * Unlike notify_remote_via_evtchn(), this is safe to use across > - * save/restore. Notifications on a broken connection are silently > - * dropped.
> - */ > -void notify_remote_via_irq(int irq) > -{ > - int evtchn = evtchn_from_irq(irq); > - > - if (VALID_EVTCHN(evtchn)) > - notify_remote_via_evtchn(evtchn); > -} > -EXPORT_SYMBOL_GPL(notify_remote_via_irq); > - > -static void mask_evtchn(int port) > -{ > - struct shared_info *s = HYPERVISOR_shared_info; > - sync_set_bit(port, &s->evtchn_mask[0]); > -} > - > -static void unmask_evtchn(int port) > -{ > - struct shared_info *s = HYPERVISOR_shared_info; > - unsigned int cpu = get_cpu(); > - > - BUG_ON(!irqs_disabled()); > - > - /* Slow path (hypercall) if this is a non-local port. */ > - if (unlikely(cpu != cpu_from_evtchn(port))) { > - struct evtchn_unmask unmask = { .port = port }; > - (void)HYPERVISOR_event_channel_op(EVTCHNOP_unmask, > &unmask); > - } else { > - struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu); > - > - sync_clear_bit(port, &s->evtchn_mask[0]); > - > - /* > - * The following is basically the equivalent of > - * 'hw_resend_irq'. Just like a real IO-APIC we 'lose > - * the interrupt edge' if the channel is masked. 
> - */ > - if (sync_test_bit(port, &s->evtchn_pending[0]) && > - !sync_test_and_set_bit(port / BITS_PER_LONG, > - > &vcpu_info->evtchn_pending_sel)) > - vcpu_info->evtchn_upcall_pending = 1; > - } > - > - put_cpu(); > -} > - > -static int find_unbound_irq(void) > -{ > - int irq; > - > - /* Only allocate from dynirq range */ > - for (irq = 0; irq < NR_IRQS; irq++) > - if (irq_bindcount[irq] == 0) > - break; > - > - if (irq == NR_IRQS) > - panic("No available IRQ to bind to: increase > NR_IRQS!\n"); > - > - return irq; > -} > - > -int bind_evtchn_to_irq(unsigned int evtchn) > -{ > - int irq; > - > - spin_lock(&irq_mapping_update_lock); > - > - irq = evtchn_to_irq[evtchn]; > - > - if (irq == -1) { > - irq = find_unbound_irq(); > - > - dynamic_irq_init(irq); > - set_irq_chip_and_handler_name(irq, &xen_dynamic_chip, > - handle_level_irq, > "event"); > - > - evtchn_to_irq[evtchn] = irq; > - irq_info[irq] = mk_irq_info(IRQT_EVTCHN, 0, evtchn); > - } > - > - irq_bindcount[irq]++; > - > - spin_unlock(&irq_mapping_update_lock); > - > - return irq; > -} > -EXPORT_SYMBOL_GPL(bind_evtchn_to_irq); > - > -static int bind_ipi_to_irq(unsigned int ipi, unsigned int cpu) > -{ > - struct evtchn_bind_ipi bind_ipi; > - int evtchn, irq; > - > - spin_lock(&irq_mapping_update_lock); > - > - irq = per_cpu(ipi_to_irq, cpu)[ipi]; > - if (irq == -1) { > - irq = find_unbound_irq(); > - if (irq < 0) > - goto out; > - > - dynamic_irq_init(irq); > - set_irq_chip_and_handler_name(irq, &xen_dynamic_chip, > - handle_level_irq, "ipi"); > - > - bind_ipi.vcpu = cpu; > - if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_ipi, > - &bind_ipi) != 0) > - BUG(); > - evtchn = bind_ipi.port; > - > - evtchn_to_irq[evtchn] = irq; > - irq_info[irq] = mk_irq_info(IRQT_IPI, ipi, evtchn); > - > - per_cpu(ipi_to_irq, cpu)[ipi] = irq; > - > - bind_evtchn_to_cpu(evtchn, cpu); > - } > - > - irq_bindcount[irq]++; > - > - out: > - spin_unlock(&irq_mapping_update_lock); > - return irq; > -} > - > - > -static int
bind_virq_to_irq(unsigned int virq, unsigned int cpu) > -{ > - struct evtchn_bind_virq bind_virq; > - int evtchn, irq; > - > - spin_lock(&irq_mapping_update_lock); > - > - irq = per_cpu(virq_to_irq, cpu)[virq]; > - > - if (irq == -1) { > - bind_virq.virq = virq; > - bind_virq.vcpu = cpu; > - if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq, > - &bind_virq) != 0) > - BUG(); > - evtchn = bind_virq.port; > - > - irq = find_unbound_irq(); > - > - dynamic_irq_init(irq); > - set_irq_chip_and_handler_name(irq, &xen_dynamic_chip, > - handle_level_irq, "virq"); > - > - evtchn_to_irq[evtchn] = irq; > - irq_info[irq] = mk_irq_info(IRQT_VIRQ, virq, evtchn); > - > - per_cpu(virq_to_irq, cpu)[virq] = irq; > - > - bind_evtchn_to_cpu(evtchn, cpu); > - } > - > - irq_bindcount[irq]++; > - > - spin_unlock(&irq_mapping_update_lock); > - > - return irq; > -} > - > -static void unbind_from_irq(unsigned int irq) > -{ > - struct evtchn_close close; > - int evtchn = evtchn_from_irq(irq); > - > - spin_lock(&irq_mapping_update_lock); > - > - if (VALID_EVTCHN(evtchn) && (--irq_bindcount[irq] == 0)) { > - close.port = evtchn; > - if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) > != 0) > - BUG(); > - > - switch (type_from_irq(irq)) { > - case IRQT_VIRQ: > - per_cpu(virq_to_irq, cpu_from_evtchn(evtchn)) > - [index_from_irq(irq)] = -1; > - break; > - default: > - break; > - } > - > - /* Closed ports are implicitly re-bound to VCPU0.
*/ > - bind_evtchn_to_cpu(evtchn, 0); > - > - evtchn_to_irq[evtchn] = -1; > - irq_info[irq] = IRQ_UNBOUND; > - > - dynamic_irq_init(irq); > - } > - > - spin_unlock(&irq_mapping_update_lock); > -} > - > -int bind_evtchn_to_irqhandler(unsigned int evtchn, > - irq_handler_t handler, > - unsigned long irqflags, > - const char *devname, void *dev_id) > -{ > - unsigned int irq; > - int retval; > - > - irq = bind_evtchn_to_irq(evtchn); > - retval = request_irq(irq, handler, irqflags, devname, dev_id); > - if (retval != 0) { > - unbind_from_irq(irq); > - return retval; > - } > - > - return irq; > -} > -EXPORT_SYMBOL_GPL(bind_evtchn_to_irqhandler); > - > -int bind_virq_to_irqhandler(unsigned int virq, unsigned int cpu, > - irq_handler_t handler, > - unsigned long irqflags, const char *devname, > void *dev_id) > -{ > - unsigned int irq; > - int retval; > - > - irq = bind_virq_to_irq(virq, cpu); > - retval = request_irq(irq, handler, irqflags, devname, dev_id); > - if (retval != 0) { > - unbind_from_irq(irq); > - return retval; > - } > - > - return irq; > -} > -EXPORT_SYMBOL_GPL(bind_virq_to_irqhandler); > - > -int bind_ipi_to_irqhandler(enum ipi_vector ipi, > - unsigned int cpu, > - irq_handler_t handler, > - unsigned long irqflags, > - const char *devname, > - void *dev_id) > -{ > - int irq, retval; > - > - irq = bind_ipi_to_irq(ipi, cpu); > - if (irq < 0) > - return irq; > - > - retval = request_irq(irq, handler, irqflags, devname, dev_id); > - if (retval != 0) { > - unbind_from_irq(irq); > - return retval; > - } > - > - return irq; > -} > - > -void unbind_from_irqhandler(unsigned int irq, void *dev_id) > -{ > - free_irq(irq, dev_id); > - unbind_from_irq(irq); > -} > -EXPORT_SYMBOL_GPL(unbind_from_irqhandler); > - > -void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector) > -{ > - int irq = per_cpu(ipi_to_irq, cpu)[vector]; > - BUG_ON(irq < 0); > - notify_remote_via_irq(irq); > -} > - > - > -/* > - * Search the CPUs pending events bitmasks. 
> - * the event number to an irq, and feed it into do_IRQ() for
> - * handling.
> - *
> - * Xen uses a two-level bitmap to speed searching.  The first level is
> - * a bitset of words which contain pending event bits.  The second
> - * level is a bitset of pending events themselves.
> - */
> -void xen_evtchn_do_upcall(struct pt_regs *regs)
> -{
> -	int cpu = get_cpu();
> -	struct shared_info *s = HYPERVISOR_shared_info;
> -	struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu);
> -	unsigned long pending_words;
> -
> -	vcpu_info->evtchn_upcall_pending = 0;
> -
> -	/* NB. No need for a barrier here -- XCHG is a barrier on x86. */
> -	pending_words = xchg(&vcpu_info->evtchn_pending_sel, 0);
> -	while (pending_words != 0) {
> -		unsigned long pending_bits;
> -		int word_idx = __ffs(pending_words);
> -		pending_words &= ~(1UL << word_idx);
> -
> -		while ((pending_bits = active_evtchns(cpu, s, word_idx)) != 0) {
> -			int bit_idx = __ffs(pending_bits);
> -			int port = (word_idx * BITS_PER_LONG) + bit_idx;
> -			int irq = evtchn_to_irq[port];
> -
> -			if (irq != -1) {
> -				regs->orig_ax = ~irq;
> -				do_IRQ(regs);
> -			}
> -		}
> -	}
> -
> -	put_cpu();
> -}
> -
> -/* Rebind an evtchn so that it gets delivered to a specific cpu */
> -static void rebind_irq_to_cpu(unsigned irq, unsigned tcpu)
> -{
> -	struct evtchn_bind_vcpu bind_vcpu;
> -	int evtchn = evtchn_from_irq(irq);
> -
> -	if (!VALID_EVTCHN(evtchn))
> -		return;
> -
> -	/* Send future instances of this interrupt to other vcpu. */
> -	bind_vcpu.port = evtchn;
> -	bind_vcpu.vcpu = tcpu;
> -
> -	/*
> -	 * If this fails, it usually just indicates that we're dealing with a
> -	 * virq or IPI channel, which don't actually need to be rebound.  Ignore
> -	 * it, but don't do the xenlinux-level rebind in that case.
> -	 */
> -	if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) >= 0)
> -		bind_evtchn_to_cpu(evtchn, tcpu);
> -}
> -
> -
> -static void set_affinity_irq(unsigned irq, cpumask_t dest)
> -{
> -	unsigned tcpu = first_cpu(dest);
> -	rebind_irq_to_cpu(irq, tcpu);
> -}
> -
> -static void enable_dynirq(unsigned int irq)
> -{
> -	int evtchn = evtchn_from_irq(irq);
> -
> -	if (VALID_EVTCHN(evtchn))
> -		unmask_evtchn(evtchn);
> -}
> -
> -static void disable_dynirq(unsigned int irq)
> -{
> -	int evtchn = evtchn_from_irq(irq);
> -
> -	if (VALID_EVTCHN(evtchn))
> -		mask_evtchn(evtchn);
> -}
> -
> -static void ack_dynirq(unsigned int irq)
> -{
> -	int evtchn = evtchn_from_irq(irq);
> -
> -	move_native_irq(irq);
> -
> -	if (VALID_EVTCHN(evtchn))
> -		clear_evtchn(evtchn);
> -}
> -
> -static int retrigger_dynirq(unsigned int irq)
> -{
> -	int evtchn = evtchn_from_irq(irq);
> -	int ret = 0;
> -
> -	if (VALID_EVTCHN(evtchn)) {
> -		set_evtchn(evtchn);
> -		ret = 1;
> -	}
> -
> -	return ret;
> -}
> -
> -static struct irq_chip xen_dynamic_chip __read_mostly = {
> -	.name		= "xen-dyn",
> -	.mask		= disable_dynirq,
> -	.unmask		= enable_dynirq,
> -	.ack		= ack_dynirq,
> -	.set_affinity	= set_affinity_irq,
> -	.retrigger	= retrigger_dynirq,
> -};
> -
> -void __init xen_init_IRQ(void)
> -{
> -	int i;
> -
> -	init_evtchn_cpu_bindings();
> -
> -	/* No event channels are 'live' right now. */
> -	for (i = 0; i < NR_EVENT_CHANNELS; i++)
> -		mask_evtchn(i);
> -
> -	/* Dynamic IRQ space is currently unbound.  Zero the refcnts. */
> -	for (i = 0; i < NR_IRQS; i++)
> -		irq_bindcount[i] = 0;
> -
> -	irq_ctx_init(smp_processor_id());
> -}
> diff -urN old/arch/x86/xen/Makefile linux/arch/x86/xen/Makefile
> --- old/arch/x86/xen/Makefile	2008-03-10 13:22:27.000000000 +0800
> +++ linux/arch/x86/xen/Makefile	2008-03-25 13:56:41.367764448 +0800
> @@ -1,4 +1,4 @@
>  obj-y		:= enlighten.o setup.o features.o multicalls.o mmu.o \
> -			events.o time.o manage.o xen-asm.o
> +			time.o manage.o xen-asm.o
> 
>  obj-$(CONFIG_SMP)	+= smp.o
> diff -urN old/arch/x86/xen/xen-ops.h linux/arch/x86/xen/xen-ops.h
> --- old/arch/x86/xen/xen-ops.h	2008-03-25 13:21:09.996527604 +0800
> +++ linux/arch/x86/xen/xen-ops.h	2008-03-25 13:59:16.349809137 +0800
> @@ -2,6 +2,7 @@
>  #define XEN_OPS_H
> 
>  #include <linux/init.h>
> +#include <xen/xen-ops.h>
> 
>  /* These are code, but not functions.  Defined in entry.S */
>  extern const char xen_hypervisor_callback[];
> @@ -9,7 +10,6 @@
> 
>  void xen_copy_trap_info(struct trap_info *traps);
> 
> -DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu);
>  DECLARE_PER_CPU(unsigned long, xen_cr3);
>  DECLARE_PER_CPU(unsigned long, xen_current_cr3);
> 
> diff -urN old/drivers/xen/events.c linux/drivers/xen/events.c
> --- old/drivers/xen/events.c	1970-01-01 08:00:00.000000000 +0800
> +++ linux/drivers/xen/events.c	2008-03-25 13:56:41.368764287 +0800
> @@ -0,0 +1,591 @@
> +/*
> + * Xen event channels
> + *
> + * Xen models interrupts with abstract event channels.  Because each
> + * domain gets 1024 event channels, but NR_IRQ is not that large, we
> + * must dynamically map irqs<->event channels.  The event channels
> + * interface with the rest of the kernel by defining a xen interrupt
> + * chip.  When an event is received, it is mapped to an irq and sent
> + * through the normal interrupt processing path.
> + *
> + * There are four kinds of events which can be mapped to an event
> + * channel:
> + *
> + * 1. Inter-domain notifications.  This includes all the virtual
> + *    device events, since they're driven by front-ends in another domain
> + *    (typically dom0).
> + * 2. VIRQs, typically used for timers.  These are per-cpu events.
> + * 3. IPIs.
> + * 4. Hardware interrupts. Not supported at present.
> + *
> + * Jeremy Fitzhardinge <jeremy@xensource.com>, XenSource Inc, 2007
> + */
> +
> +#include <linux/linkage.h>
> +#include <linux/interrupt.h>
> +#include <linux/irq.h>
> +#include <linux/module.h>
> +#include <linux/string.h>
> +
> +#include <asm/ptrace.h>
> +#include <asm/irq.h>
> +#include <asm/sync_bitops.h>
> +#include <asm/xen/hypercall.h>
> +#include <asm/xen/hypervisor.h>
> +
> +#include <xen/events.h>
> +#include <xen/interface/xen.h>
> +#include <xen/interface/event_channel.h>
> +
> +#include "xen-ops.h"
> +
> +/*
> + * This lock protects updates to the following mapping and reference-count
> + * arrays.  The lock does not need to be acquired to read the mapping tables.
> + */
> +static DEFINE_SPINLOCK(irq_mapping_update_lock);
> +
> +/* IRQ <-> VIRQ mapping. */
> +static DEFINE_PER_CPU(int, virq_to_irq[NR_VIRQS]) = {[0 ... NR_VIRQS-1] = -1};
> +
> +/* IRQ <-> IPI mapping */
> +static DEFINE_PER_CPU(int, ipi_to_irq[XEN_NR_IPIS]) = {[0 ... XEN_NR_IPIS-1] = -1};
> +
> +/* Packed IRQ information: binding type, sub-type index, and event channel. */
> +struct packed_irq
> +{
> +	unsigned short evtchn;
> +	unsigned char index;
> +	unsigned char type;
> +};
> +
> +static struct packed_irq irq_info[NR_IRQS];
> +
> +/* Binding types. */
> +enum {
> +	IRQT_UNBOUND,
> +	IRQT_PIRQ,
> +	IRQT_VIRQ,
> +	IRQT_IPI,
> +	IRQT_EVTCHN
> +};
> +
> +/* Convenient shorthand for packed representation of an unbound IRQ. */
> +#define IRQ_UNBOUND	mk_irq_info(IRQT_UNBOUND, 0, 0)
> +
> +static int evtchn_to_irq[NR_EVENT_CHANNELS] = {
> +	[0 ... NR_EVENT_CHANNELS-1] = -1
> +};
> +static unsigned long cpu_evtchn_mask[NR_CPUS][NR_EVENT_CHANNELS/BITS_PER_LONG];
> +static u8 cpu_evtchn[NR_EVENT_CHANNELS];
> +
> +/* Reference counts for bindings to IRQs. */
> +static int irq_bindcount[NR_IRQS];
> +
> +/* Xen will never allocate port zero for any purpose. */
> +#define VALID_EVTCHN(chn)	((chn) != 0)
> +
> +/*
> + * Force a proper event-channel callback from Xen after clearing the
> + * callback mask. We do this in a very simple manner, by making a call
> + * down into Xen. The pending flag will be checked by Xen on return.
> + */
> +void force_evtchn_callback(void)
> +{
> +	(void)HYPERVISOR_xen_version(0, NULL);
> +}
> +EXPORT_SYMBOL_GPL(force_evtchn_callback);
> +
> +static struct irq_chip xen_dynamic_chip;
> +
> +/* Constructor for packed IRQ information. */
> +static inline struct packed_irq mk_irq_info(u32 type, u32 index, u32 evtchn)
> +{
> +	return (struct packed_irq) { evtchn, index, type };
> +}
> +
> +/*
> + * Accessors for packed IRQ information.
> + */
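
[As an aside, the packed encoding above is easy to exercise on its own. The sketch below is plain user-space C, not the kernel code: it mirrors struct packed_irq and mk_irq_info() with standard types standing in for u32/u8, just to show that the three fields round-trip through the 4-byte record.]

```c
#include <assert.h>

/* Stand-alone mirror of the packed IRQ record: event channel,
 * sub-type index, and binding type in four bytes. */
struct packed_irq {
	unsigned short evtchn;
	unsigned char index;
	unsigned char type;
};

/* Constructor, as in the patch: note the argument order is
 * (type, index, evtchn) but the initializer order matches the
 * struct layout (evtchn, index, type). */
static struct packed_irq mk_irq_info(unsigned type, unsigned index,
				     unsigned evtchn)
{
	return (struct packed_irq){ evtchn, index, type };
}
```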
> +static inline unsigned int evtchn_from_irq(int irq)
> +{
> +	return irq_info[irq].evtchn;
> +}
> +
> +static inline unsigned int index_from_irq(int irq)
> +{
> +	return irq_info[irq].index;
> +}
> +
> +static inline unsigned int type_from_irq(int irq)
> +{
> +	return irq_info[irq].type;
> +}
> +
> +static inline unsigned long active_evtchns(unsigned int cpu,
> +					   struct shared_info *sh,
> +					   unsigned int idx)
> +{
> +	return (sh->evtchn_pending[idx] &
> +		cpu_evtchn_mask[cpu][idx] &
> +		~sh->evtchn_mask[idx]);
> +}
> +
> +static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu)
> +{
> +	int irq = evtchn_to_irq[chn];
> +
> +	BUG_ON(irq == -1);
> +#ifdef CONFIG_SMP
> +	irq_desc[irq].affinity = cpumask_of_cpu(cpu);
> +#endif
> +
> +	__clear_bit(chn, cpu_evtchn_mask[cpu_evtchn[chn]]);
> +	__set_bit(chn, cpu_evtchn_mask[cpu]);
> +
> +	cpu_evtchn[chn] = cpu;
> +}
> +
> +static void init_evtchn_cpu_bindings(void)
> +{
> +#ifdef CONFIG_SMP
> +	int i;
> +	/* By default all event channels notify CPU#0. */
> +	for (i = 0; i < NR_IRQS; i++)
> +		irq_desc[i].affinity = cpumask_of_cpu(0);
> +#endif
> +
> +	memset(cpu_evtchn, 0, sizeof(cpu_evtchn));
> +	memset(cpu_evtchn_mask[0], ~0, sizeof(cpu_evtchn_mask[0]));
> +}
> +
> +static inline unsigned int cpu_from_evtchn(unsigned int evtchn)
> +{
> +	return cpu_evtchn[evtchn];
> +}
> +
> +static inline void clear_evtchn(int port)
> +{
> +	struct shared_info *s = HYPERVISOR_shared_info;
> +	sync_clear_bit(port, &s->evtchn_pending[0]);
> +}
> +
> +static inline void set_evtchn(int port)
> +{
> +	struct shared_info *s = HYPERVISOR_shared_info;
> +	sync_set_bit(port, &s->evtchn_pending[0]);
> +}
> +
> +
> +/**
> + * notify_remote_via_irq - send event to remote end of event channel via irq
> + * @irq: irq of event channel to send event to
> + *
> + * Unlike notify_remote_via_evtchn(), this is safe to use across
> + * save/restore.  Notifications on a broken connection are silently
> + * dropped.
> + */ > +void notify_remote_via_irq(int irq) > +{ > + int evtchn = evtchn_from_irq(irq); > + > + if (VALID_EVTCHN(evtchn)) > + notify_remote_via_evtchn(evtchn); > +} > +EXPORT_SYMBOL_GPL(notify_remote_via_irq); > + > +static void mask_evtchn(int port) > +{ > + struct shared_info *s = HYPERVISOR_shared_info; > + sync_set_bit(port, &s->evtchn_mask[0]); > +} > + > +static void unmask_evtchn(int port) > +{ > + struct shared_info *s = HYPERVISOR_shared_info; > + unsigned int cpu = get_cpu(); > + > + BUG_ON(!irqs_disabled()); > + > + /* Slow path (hypercall) if this is a non-local port. */ > + if (unlikely(cpu != cpu_from_evtchn(port))) { > + struct evtchn_unmask unmask = { .port = port }; > + (void)HYPERVISOR_event_channel_op(EVTCHNOP_unmask, > &unmask); > + } else { > + struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu); > + > + sync_clear_bit(port, &s->evtchn_mask[0]); > + > + /* > + * The following is basically the equivalent of > + * 'hw_resend_irq'. Just like a real IO-APIC we 'lose > + * the interrupt edge' if the channel is masked. 
> +		 */
> +		if (sync_test_bit(port, &s->evtchn_pending[0]) &&
> +		    !sync_test_and_set_bit(port / BITS_PER_LONG,
> +					   &vcpu_info->evtchn_pending_sel))
> +			vcpu_info->evtchn_upcall_pending = 1;
> +	}
> +
> +	put_cpu();
> +}
> +
> +static int find_unbound_irq(void)
> +{
> +	int irq;
> +
> +	/* Only allocate from dynirq range */
> +	for (irq = 0; irq < NR_IRQS; irq++)
> +		if (irq_bindcount[irq] == 0)
> +			break;
> +
> +	if (irq == NR_IRQS)
> +		panic("No available IRQ to bind to: increase NR_IRQS!\n");
> +
> +	return irq;
> +}
> +
> +int bind_evtchn_to_irq(unsigned int evtchn)
> +{
> +	int irq;
> +
> +	spin_lock(&irq_mapping_update_lock);
> +
> +	irq = evtchn_to_irq[evtchn];
> +
> +	if (irq == -1) {
> +		irq = find_unbound_irq();
> +
> +		dynamic_irq_init(irq);
> +		set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
> +					      handle_level_irq, "event");
> +
> +		evtchn_to_irq[evtchn] = irq;
> +		irq_info[irq] = mk_irq_info(IRQT_EVTCHN, 0, evtchn);
> +	}
> +
> +	irq_bindcount[irq]++;
> +
> +	spin_unlock(&irq_mapping_update_lock);
> +
> +	return irq;
> +}
> +EXPORT_SYMBOL_GPL(bind_evtchn_to_irq);
> +
> +static int bind_ipi_to_irq(unsigned int ipi, unsigned int cpu)
> +{
> +	struct evtchn_bind_ipi bind_ipi;
> +	int evtchn, irq;
> +
> +	spin_lock(&irq_mapping_update_lock);
> +
> +	irq = per_cpu(ipi_to_irq, cpu)[ipi];
> +	if (irq == -1) {
> +		irq = find_unbound_irq();
> +		if (irq < 0)
> +			goto out;
> +
> +		dynamic_irq_init(irq);
> +		set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
> +					      handle_level_irq, "ipi");
> +
> +		bind_ipi.vcpu = cpu;
> +		if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_ipi,
> +						&bind_ipi) != 0)
> +			BUG();
> +		evtchn = bind_ipi.port;
> +
> +		evtchn_to_irq[evtchn] = irq;
> +		irq_info[irq] = mk_irq_info(IRQT_IPI, ipi, evtchn);
> +
> +		per_cpu(ipi_to_irq, cpu)[ipi] = irq;
> +
> +		bind_evtchn_to_cpu(evtchn, cpu);
> +	}
> +
> +	irq_bindcount[irq]++;
> +
> + out:
> +	spin_unlock(&irq_mapping_update_lock);
> +	return irq;
> +}
> +
> +
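[The bind_*_to_irq() routines above all share one pattern: a per-IRQ reference count taken under irq_mapping_update_lock, with the evtchn<->irq mapping created on the first bind and torn down on the last unbind. This toy user-space model is not the kernel code — it drops the hypercalls and locking — but it shows just the refcount discipline:]

```c
#include <assert.h>

/* mapping_live stands in for the evtchn_to_irq/irq_info entries. */
static int bindcount;
static int mapping_live;

static void bind(void)
{
	if (bindcount++ == 0)
		mapping_live = 1;	/* first user: set up the mapping */
}

static void unbind(void)
{
	if (--bindcount == 0)
		mapping_live = 0;	/* last user: close the port */
}
```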
> +static int bind_virq_to_irq(unsigned int virq, unsigned int cpu)
> +{
> +	struct evtchn_bind_virq bind_virq;
> +	int evtchn, irq;
> +
> +	spin_lock(&irq_mapping_update_lock);
> +
> +	irq = per_cpu(virq_to_irq, cpu)[virq];
> +
> +	if (irq == -1) {
> +		bind_virq.virq = virq;
> +		bind_virq.vcpu = cpu;
> +		if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq,
> +						&bind_virq) != 0)
> +			BUG();
> +		evtchn = bind_virq.port;
> +
> +		irq = find_unbound_irq();
> +
> +		dynamic_irq_init(irq);
> +		set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
> +					      handle_level_irq, "virq");
> +
> +		evtchn_to_irq[evtchn] = irq;
> +		irq_info[irq] = mk_irq_info(IRQT_VIRQ, virq, evtchn);
> +
> +		per_cpu(virq_to_irq, cpu)[virq] = irq;
> +
> +		bind_evtchn_to_cpu(evtchn, cpu);
> +	}
> +
> +	irq_bindcount[irq]++;
> +
> +	spin_unlock(&irq_mapping_update_lock);
> +
> +	return irq;
> +}
> +
> +static void unbind_from_irq(unsigned int irq)
> +{
> +	struct evtchn_close close;
> +	int evtchn = evtchn_from_irq(irq);
> +
> +	spin_lock(&irq_mapping_update_lock);
> +
> +	if (VALID_EVTCHN(evtchn) && (--irq_bindcount[irq] == 0)) {
> +		close.port = evtchn;
> +		if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0)
> +			BUG();
> +
> +		switch (type_from_irq(irq)) {
> +		case IRQT_VIRQ:
> +			per_cpu(virq_to_irq, cpu_from_evtchn(evtchn))
> +				[index_from_irq(irq)] = -1;
> +			break;
> +		default:
> +			break;
> +		}
> +
> +		/* Closed ports are implicitly re-bound to VCPU0. */
> +		bind_evtchn_to_cpu(evtchn, 0);
> +
> +		evtchn_to_irq[evtchn] = -1;
> +		irq_info[irq] = IRQ_UNBOUND;
> +
> +		dynamic_irq_init(irq);
> +	}
> +
> +	spin_unlock(&irq_mapping_update_lock);
> +}
> +
> +int bind_evtchn_to_irqhandler(unsigned int evtchn,
> +			      irq_handler_t handler,
> +			      unsigned long irqflags,
> +			      const char *devname, void *dev_id)
> +{
> +	unsigned int irq;
> +	int retval;
> +
> +	irq = bind_evtchn_to_irq(evtchn);
> +	retval = request_irq(irq, handler, irqflags, devname, dev_id);
> +	if (retval != 0) {
> +		unbind_from_irq(irq);
> +		return retval;
> +	}
> +
> +	return irq;
> +}
> +EXPORT_SYMBOL_GPL(bind_evtchn_to_irqhandler);
> +
> +int bind_virq_to_irqhandler(unsigned int virq, unsigned int cpu,
> +			    irq_handler_t handler,
> +			    unsigned long irqflags, const char *devname, void *dev_id)
> +{
> +	unsigned int irq;
> +	int retval;
> +
> +	irq = bind_virq_to_irq(virq, cpu);
> +	retval = request_irq(irq, handler, irqflags, devname, dev_id);
> +	if (retval != 0) {
> +		unbind_from_irq(irq);
> +		return retval;
> +	}
> +
> +	return irq;
> +}
> +EXPORT_SYMBOL_GPL(bind_virq_to_irqhandler);
> +
> +int bind_ipi_to_irqhandler(enum ipi_vector ipi,
> +			   unsigned int cpu,
> +			   irq_handler_t handler,
> +			   unsigned long irqflags,
> +			   const char *devname,
> +			   void *dev_id)
> +{
> +	int irq, retval;
> +
> +	irq = bind_ipi_to_irq(ipi, cpu);
> +	if (irq < 0)
> +		return irq;
> +
> +	retval = request_irq(irq, handler, irqflags, devname, dev_id);
> +	if (retval != 0) {
> +		unbind_from_irq(irq);
> +		return retval;
> +	}
> +
> +	return irq;
> +}
> +
> +void unbind_from_irqhandler(unsigned int irq, void *dev_id)
> +{
> +	free_irq(irq, dev_id);
> +	unbind_from_irq(irq);
> +}
> +EXPORT_SYMBOL_GPL(unbind_from_irqhandler);
> +
> +void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector)
> +{
> +	int irq = per_cpu(ipi_to_irq, cpu)[vector];
> +	BUG_ON(irq < 0);
> +	notify_remote_via_irq(irq);
> +}
> +
> +
> +/*
> + * Search the CPU's pending events bitmasks.  For each one found, map
> + * the event number to an irq, and feed it into do_IRQ() for
> + * handling.
> + *
> + * Xen uses a two-level bitmap to speed searching.  The first level is
> + * a bitset of words which contain pending event bits.  The second
> + * level is a bitset of pending events themselves.
> + */
> +void xen_evtchn_do_upcall(struct pt_regs *regs)
> +{
> +	int cpu = get_cpu();
> +	struct shared_info *s = HYPERVISOR_shared_info;
> +	struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu);
> +	unsigned long pending_words;
> +
> +	vcpu_info->evtchn_upcall_pending = 0;
> +
> +	/* NB. No need for a barrier here -- XCHG is a barrier on x86. */
> +	pending_words = xchg(&vcpu_info->evtchn_pending_sel, 0);
> +	while (pending_words != 0) {
> +		unsigned long pending_bits;
> +		int word_idx = __ffs(pending_words);
> +		pending_words &= ~(1UL << word_idx);
> +
> +		while ((pending_bits = active_evtchns(cpu, s, word_idx)) != 0) {
> +			int bit_idx = __ffs(pending_bits);
> +			int port = (word_idx * BITS_PER_LONG) + bit_idx;
> +			int irq = evtchn_to_irq[port];
> +
> +			if (irq != -1) {
> +				regs->orig_ax = ~irq;
> +				do_IRQ(regs);
> +			}
> +		}
> +	}
> +
> +	put_cpu();
> +}
> +
> +/* Rebind an evtchn so that it gets delivered to a specific cpu */
> +static void rebind_irq_to_cpu(unsigned irq, unsigned tcpu)
> +{
> +	struct evtchn_bind_vcpu bind_vcpu;
> +	int evtchn = evtchn_from_irq(irq);
> +
> +	if (!VALID_EVTCHN(evtchn))
> +		return;
> +
> +	/* Send future instances of this interrupt to other vcpu. */
> +	bind_vcpu.port = evtchn;
> +	bind_vcpu.vcpu = tcpu;
> +
> +	/*
> +	 * If this fails, it usually just indicates that we're dealing with a
> +	 * virq or IPI channel, which don't actually need to be rebound.  Ignore
> +	 * it, but don't do the xenlinux-level rebind in that case.
> +	 */
> +	if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) >= 0)
> +		bind_evtchn_to_cpu(evtchn, tcpu);
> +}
> +
> +
> +static void set_affinity_irq(unsigned irq, cpumask_t dest)
> +{
> +	unsigned tcpu = first_cpu(dest);
> +	rebind_irq_to_cpu(irq, tcpu);
> +}
> +
> +static void enable_dynirq(unsigned int irq)
> +{
> +	int evtchn = evtchn_from_irq(irq);
> +
> +	if (VALID_EVTCHN(evtchn))
> +		unmask_evtchn(evtchn);
> +}
> +
> +static void disable_dynirq(unsigned int irq)
> +{
> +	int evtchn = evtchn_from_irq(irq);
> +
> +	if (VALID_EVTCHN(evtchn))
> +		mask_evtchn(evtchn);
> +}
> +
> +static void ack_dynirq(unsigned int irq)
> +{
> +	int evtchn = evtchn_from_irq(irq);
> +
> +	move_native_irq(irq);
> +
> +	if (VALID_EVTCHN(evtchn))
> +		clear_evtchn(evtchn);
> +}
> +
> +static int retrigger_dynirq(unsigned int irq)
> +{
> +	int evtchn = evtchn_from_irq(irq);
> +	int ret = 0;
> +
> +	if (VALID_EVTCHN(evtchn)) {
> +		set_evtchn(evtchn);
> +		ret = 1;
> +	}
> +
> +	return ret;
> +}
> +
> +static struct irq_chip xen_dynamic_chip __read_mostly = {
> +	.name		= "xen-dyn",
> +	.mask		= disable_dynirq,
> +	.unmask		= enable_dynirq,
> +	.ack		= ack_dynirq,
> +	.set_affinity	= set_affinity_irq,
> +	.retrigger	= retrigger_dynirq,
> +};
> +
> +void __init xen_init_IRQ(void)
> +{
> +	int i;
> +
> +	init_evtchn_cpu_bindings();
> +
> +	/* No event channels are 'live' right now. */
> +	for (i = 0; i < NR_EVENT_CHANNELS; i++)
> +		mask_evtchn(i);
> +
> +	/* Dynamic IRQ space is currently unbound.  Zero the refcnts. */
> +	for (i = 0; i < NR_IRQS; i++)
> +		irq_bindcount[i] = 0;
> +
> +	irq_ctx_init(smp_processor_id());
> +}
> diff -urN old/drivers/xen/Makefile linux/drivers/xen/Makefile
> --- old/drivers/xen/Makefile	2008-03-10 13:22:27.000000000 +0800
> +++ linux/drivers/xen/Makefile	2008-03-25 13:56:41.368764287 +0800
> @@ -1,2 +1,2 @@
> -obj-y	+= grant-table.o
> +obj-y	+= grant-table.o events.o
>  obj-y	+= xenbus/
> diff -urN old/include/xen/xen-ops.h linux/include/xen/xen-ops.h
> --- old/include/xen/xen-ops.h	1970-01-01 08:00:00.000000000 +0800
> +++ linux/include/xen/xen-ops.h	2008-03-25 14:00:09.041321546 +0800
> @@ -0,0 +1,6 @@
> +#ifndef INCLUDE_XEN_OPS_H
> +#define INCLUDE_XEN_OPS_H
> +
> +DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu);

This should include <linux/percpu.h> rather than assuming it has already
been included.

	J
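
[Editorial aside: the two-level scan in xen_evtchn_do_upcall() quoted above can be tried in isolation. The sketch below is user-space C, not the kernel code: __builtin_ctzl stands in for __ffs(), a plain array stands in for the shared-info page, and bits are cleared locally rather than via the irq_chip ack. It collects pending port numbers in the same order the upcall loop would dispatch them.]

```c
#include <assert.h>

#define BITS_PER_LONG ((int)(8 * sizeof(long)))

/* Walk a one-word first-level selector, then each selected word's
 * pending bits, translating (word, bit) into a port number.
 * Returns how many ports were found, at most max. */
static int scan_pending(unsigned long sel, const unsigned long *pending,
			int *ports, int max)
{
	int n = 0;

	while (sel != 0) {
		int word_idx = __builtin_ctzl(sel);	/* like __ffs() */
		unsigned long bits = pending[word_idx];

		sel &= ~(1UL << word_idx);
		while (bits != 0 && n < max) {
			int bit_idx = __builtin_ctzl(bits);

			bits &= ~(1UL << bit_idx);
			ports[n++] = word_idx * BITS_PER_LONG + bit_idx;
		}
	}
	return n;
}
```

With selector 0b11 and pending words {0b101, 0b10}, this yields ports 0, 2, and BITS_PER_LONG + 1, mirroring how a set selector bit directs the scan to the matching second-level word.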
end of thread, other threads: [~2008-03-25 15:08 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed -- links below jump to the message on this page):
2008-03-05 18:18 [PATCH 00/50] ia64/xen take 3: ia64/xen domU paravirtualization Isaku Yamahata
2008-03-05 18:19 ` [PATCH 50/50] ia64/pv_ops/xen: define xen pv_irq_ops Isaku Yamahata
2008-03-20  9:13   ` Xen common code across architecture Dong, Eddie
2008-03-20 14:23     ` Jeremy Fitzhardinge
2008-03-25  6:13       ` Dong, Eddie
2008-03-25  6:35         ` Dong, Eddie
2008-03-25 15:08           ` Jeremy Fitzhardinge