public inbox for linux-ia64@vger.kernel.org
* [PATCH 00/50] ia64/xen take 3: ia64/xen domU paravirtualization
@ 2008-03-05 18:18 Isaku Yamahata
  2008-03-05 18:19 ` [PATCH 50/50] ia64/pv_ops/xen: define xen pv_irq_ops Isaku Yamahata
  0 siblings, 1 reply; 7+ messages in thread
From: Isaku Yamahata @ 2008-03-05 18:18 UTC (permalink / raw)
  To: linux-ia64

Hi. This patchset implements xen/ia64 domU support.
Qing He and Eddie Dong have also been working on pv_ops, so
I would like to discuss the approach before going further and avoid
duplicated work. I expect that Eddie will post his own patches as well;
by reviewing both sets, we can reach a better pv_ops interface.


- I didn't change the ia64 intrinsic paravirtualization ABI from
  the last post. Presumably it is better to discuss it alongside
  Eddie's patch.

- I implemented the basic portion of domU pv_ops.
  The interfaces may need refinement; Eddie probably has
  his own opinion.

- This time I dropped the patches which haven't been pv_ops'ified yet
  because they are unchanged from the last post.

You can also get the full source from
http://people.valinux.co.jp/~yamahata/xen-ia64/linux-2.6-xen-ia64.git/
branch: xen-ia64-2008mar06

The patchset is organized as follows:
- xen arch portability.
  Generalizes x86 xen patches for ia64 support.
- some preliminary patches.
  Make kernel paravirtualization friendly.
- introduce pv_ops and the definitions for native.
- basic helper functions for xen ia64 support.
- introduce the pv_ops instance for xen/ia64.


TODO:
- discuss and define intrinsic paravirtualization abi.
- discuss pv_ops.
- more pv_ops for domU
  - mca/sal call
  - timer
  - gate page
  - fsys
- support save/restore/live migration
- more clean ups
  - remove unnecessary if (is_running_on_xen()).
- free the xen_ivt area somehow so that kernel space is not wasted
  (from an idea by Keith Owens).
  Probably after defining the ABI, since this is just an optimization.
- dom0
  consider after finishing domU/ia64 merge.


Changes from take 2:
- many clean-ups following review comments.
- cleaned up the assembly instruction macros.
- introduced pv_ops: pv_info, pv_init_ops, pv_iosapic_ops, pv_irq_ops.

Changes from take 1:
Single IVT source file, compiled multiple times using assembler macros.

thanks,

Diffstat:
 arch/ia64/Kconfig                                  |   72 +++
 arch/ia64/kernel/Makefile                          |   30 +-
 arch/ia64/kernel/acpi.c                            |    4 +
 arch/ia64/kernel/asm-offsets.c                     |   25 +
 arch/ia64/kernel/entry.S                           |  568 +------------------
 arch/ia64/kernel/head.S                            |    6 +
 arch/ia64/kernel/inst_native.h                     |  183 ++++++
 arch/ia64/kernel/inst_paravirt.h                   |   28 +
 arch/ia64/kernel/iosapic.c                         |   43 +-
 arch/ia64/kernel/irq_ia64.c                        |   21 +-
 arch/ia64/kernel/ivt.S                             |  153 +++---
 arch/ia64/kernel/minstate.h                        |   10 +-
 arch/ia64/kernel/module.c                          |   32 +
 arch/ia64/kernel/pal.S                             |    5 +-
 arch/ia64/kernel/paravirt.c                        |   94 +++
 arch/ia64/kernel/paravirt_alt.c                    |  118 ++++
 arch/ia64/kernel/paravirt_core.c                   |  201 +++++++
 arch/ia64/kernel/paravirt_entry.c                  |   99 ++++
 arch/ia64/kernel/paravirt_nop.c                    |   49 ++
 arch/ia64/kernel/paravirtentry.S                   |   37 ++
 arch/ia64/kernel/setup.c                           |   14 +
 arch/ia64/kernel/smpboot.c                         |    2 +
 arch/ia64/kernel/switch_leave.S                    |  603 +++++++++++++++++++
 arch/ia64/kernel/vmlinux.lds.S                     |   35 ++
 arch/ia64/xen/Makefile                             |    9 +
 arch/ia64/xen/hypercall.S                          |  141 +++++
 arch/ia64/xen/hypervisor.c                         |  235 ++++++++
 arch/ia64/xen/inst_xen.h                           |  503 ++++++++++++++++
 arch/ia64/xen/irq_xen.c                            |  435 ++++++++++++++
 arch/ia64/xen/irq_xen.h                            |    8 +
 arch/ia64/xen/machvec.c                            |    4 +
 arch/ia64/xen/paravirt_xen.c                       |  242 ++++++++
 arch/ia64/xen/privops_asm.S                        |  221 +++++++
 arch/ia64/xen/privops_c.c                          |  279 +++++++++
 arch/ia64/xen/util.c                               |  101 ++++
 arch/ia64/xen/xcom_asm.S                           |   27 +
 arch/ia64/xen/xcom_hcall.c                         |  458 +++++++++++++++
 arch/ia64/xen/xen_pv_ops.c                         |  319 ++++++++++
 arch/ia64/xen/xencomm.c                            |  108 ++++
 arch/ia64/xen/xenivt.S                             |   59 ++
 arch/ia64/{kernel/minstate.h => xen/xenminstate.h} |   96 +---
 arch/ia64/xen/xenpal.S                             |   76 +++
 arch/ia64/xen/xensetup.S                           |   60 ++
 arch/x86/xen/Makefile                              |    4 +-
 arch/x86/xen/grant-table.c                         |   91 +++
 arch/x86/xen/xen-ops.h                             |    2 +-
 drivers/xen/Makefile                               |    3 +-
 {arch/x86 => drivers}/xen/events.c                 |   33 +-
 {arch/x86 => drivers}/xen/features.c               |    0 
 drivers/xen/grant-table.c                          |   37 +--
 drivers/xen/xenbus/xenbus_client.c                 |    6 +-
 drivers/xen/xencomm.c                              |  232 ++++++++
 include/asm-ia64/gcc_intrin.h                      |   58 +-
 include/asm-ia64/hw_irq.h                          |   24 +-
 include/asm-ia64/intel_intrin.h                    |   64 +-
 include/asm-ia64/intrinsics.h                      |   12 +
 include/asm-ia64/iosapic.h                         |   18 +-
 include/asm-ia64/irq.h                             |   33 ++
 include/asm-ia64/machvec.h                         |    2 +
 include/asm-ia64/machvec_xen.h                     |   22 +
 include/asm-ia64/meminit.h                         |    3 +-
 include/asm-ia64/mmu_context.h                     |    6 +-
 include/asm-ia64/module.h                          |    6 +
 include/asm-ia64/page.h                            |    8 +
 include/asm-ia64/paravirt.h                        |  284 +++++++++
 include/asm-ia64/paravirt_alt.h                    |   82 +++
 include/asm-ia64/paravirt_core.h                   |   54 ++
 include/asm-ia64/paravirt_entry.h                  |   62 ++
 include/asm-ia64/paravirt_nop.h                    |   46 ++
 include/asm-ia64/privop.h                          |   67 +++
 include/asm-ia64/privop_paravirt.h                 |  587 +++++++++++++++++++
 include/asm-ia64/sync_bitops.h                     |   59 ++
 include/asm-ia64/system.h                          |    4 +-
 include/asm-ia64/xen/hypercall.h                   |  426 ++++++++++++++
 include/asm-ia64/xen/hypervisor.h                  |  249 ++++++++
 include/asm-ia64/xen/interface.h                   |  585 +++++++++++++++++++
 include/asm-ia64/xen/page.h                        |   41 ++
 include/asm-ia64/xen/privop.h                      |  609 ++++++++++++++++++++
 include/asm-ia64/xen/xcom_hcall.h                  |   55 ++
 include/asm-ia64/xen/xencomm.h                     |   33 ++
 include/asm-x86/xen/hypervisor.h                   |   10 +
 include/asm-x86/xen/interface.h                    |   24 +
 include/{ => asm-x86}/xen/page.h                   |    0 
 include/xen/events.h                               |    1 +
 include/xen/grant_table.h                          |    6 +
 include/xen/interface/callback.h                   |  119 ++++
 include/xen/interface/grant_table.h                |   11 +-
 include/xen/interface/vcpu.h                       |    5 +
 include/xen/interface/xen.h                        |   22 +-
 include/xen/interface/xencomm.h                    |   41 ++
 include/xen/page.h                                 |  181 +------
 include/xen/xen-ops.h                              |    6 +
 include/xen/xencomm.h                              |   77 +++
 93 files changed, 9174 insertions(+), 1049 deletions(-)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 50/50] ia64/pv_ops/xen: define xen pv_irq_ops.
@ 2008-03-05 18:19 ` Isaku Yamahata
  2008-03-20  9:13   ` Xen common code across architecture Dong, Eddie
  0 siblings, 1 reply; 7+ messages in thread
From: Isaku Yamahata @ 2008-03-05 18:19 UTC (permalink / raw)
  To: linux-ia64


Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
---
 arch/ia64/xen/Makefile     |    2 +-
 arch/ia64/xen/hypercall.S  |   10 +
 arch/ia64/xen/irq_xen.c    |  435 ++++++++++++++++++++++++++++++++++++++++++++
 arch/ia64/xen/irq_xen.h    |    8 +
 arch/ia64/xen/xen_pv_ops.c |    3 +
 include/asm-ia64/hw_irq.h  |    4 +
 include/asm-ia64/irq.h     |   33 ++++
 7 files changed, 494 insertions(+), 1 deletions(-)
 create mode 100644 arch/ia64/xen/irq_xen.c
 create mode 100644 arch/ia64/xen/irq_xen.h

diff --git a/arch/ia64/xen/Makefile b/arch/ia64/xen/Makefile
index 4b1db56..ff7a58d 100644
--- a/arch/ia64/xen/Makefile
+++ b/arch/ia64/xen/Makefile
@@ -2,7 +2,7 @@
 # Makefile for Xen components
 #
 
-obj-y := xen_pv_ops.o
+obj-y := xen_pv_ops.o irq_xen.o
 
 obj-$(CONFIG_PARAVIRT_ALT) += paravirt_xen.o privops_asm.o privops_c.o
 obj-$(CONFIG_PARAVIRT_NOP_B_PATCH) += paravirt_xen.o
diff --git a/arch/ia64/xen/hypercall.S b/arch/ia64/xen/hypercall.S
index 7c5242b..3fad2fe 100644
--- a/arch/ia64/xen/hypercall.S
+++ b/arch/ia64/xen/hypercall.S
@@ -123,6 +123,16 @@ END(xen_set_eflag)
 #endif /* CONFIG_IA32_SUPPORT */
 #endif /* ASM_SUPPORTED */
 
+GLOBAL_ENTRY(xen_send_ipi)
+	mov r14=r32
+	mov r15=r33
+	mov r2=0x400
+	break 0x1000
+	;;
+	br.ret.sptk.many rp
+	;;
+END(xen_send_ipi)
+
 GLOBAL_ENTRY(__hypercall)
 	mov r2=r37
 	break 0x1000
diff --git a/arch/ia64/xen/irq_xen.c b/arch/ia64/xen/irq_xen.c
new file mode 100644
index 0000000..57fab2b
--- /dev/null
+++ b/arch/ia64/xen/irq_xen.c
@@ -0,0 +1,435 @@
+/******************************************************************************
+ * arch/ia64/xen/irq_xen.c
+ *
+ * Copyright (c) 2006 Isaku Yamahata <yamahata at valinux co jp>
+ *                    VA Linux Systems Japan K.K.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ *
+ */
+
+#include <linux/cpu.h>
+
+#include <xen/events.h>
+#include <xen/interface/callback.h>
+
+#include "irq_xen.h"
+
+/***************************************************************************
+ * pv_irq_ops
+ * irq operations
+ */
+
+static int
+xen_assign_irq_vector(int irq)
+{
+	struct physdev_irq irq_op;
+
+	irq_op.irq = irq;
+	if (HYPERVISOR_physdev_op(PHYSDEVOP_alloc_irq_vector, &irq_op))
+		return -ENOSPC;
+
+	return irq_op.vector;
+}
+
+static void
+xen_free_irq_vector(int vector)
+{
+	struct physdev_irq irq_op;
+
+	if (vector < IA64_FIRST_DEVICE_VECTOR ||
+	    vector > IA64_LAST_DEVICE_VECTOR)
+		return;
+
+	irq_op.vector = vector;
+	if (HYPERVISOR_physdev_op(PHYSDEVOP_free_irq_vector, &irq_op))
+		printk(KERN_WARNING "%s: xen_free_irq_vector fail vector=%d\n",
+		       __func__, vector);
+}
+
+
+static DEFINE_PER_CPU(int, timer_irq) = -1;
+static DEFINE_PER_CPU(int, ipi_irq) = -1;
+static DEFINE_PER_CPU(int, resched_irq) = -1;
+static DEFINE_PER_CPU(int, cmc_irq) = -1;
+static DEFINE_PER_CPU(int, cmcp_irq) = -1;
+static DEFINE_PER_CPU(int, cpep_irq) = -1;
+#define NAME_SIZE	15
+static DEFINE_PER_CPU(char[NAME_SIZE], timer_name);
+static DEFINE_PER_CPU(char[NAME_SIZE], ipi_name);
+static DEFINE_PER_CPU(char[NAME_SIZE], resched_name);
+static DEFINE_PER_CPU(char[NAME_SIZE], cmc_name);
+static DEFINE_PER_CPU(char[NAME_SIZE], cmcp_name);
+static DEFINE_PER_CPU(char[NAME_SIZE], cpep_name);
+#undef NAME_SIZE
+
+struct saved_irq {
+	unsigned int irq;
+	struct irqaction *action;
+};
+/* 16 should be more than enough, since only a few percpu irqs
+ * are registered early.
+ */
+#define MAX_LATE_IRQ	16
+static struct saved_irq saved_percpu_irqs[MAX_LATE_IRQ];
+static unsigned short late_irq_cnt = 0;
+static unsigned short saved_irq_cnt = 0;
+static int xen_slab_ready = 0;
+
+#ifdef CONFIG_SMP
+/* Dummy stub. Though we may check RESCHEDULE_VECTOR before __do_IRQ,
+ * doing so issues several memory accesses on percpu data and
+ * thus adds unnecessary traffic to other paths.
+ */
+static irqreturn_t
+xen_dummy_handler(int irq, void *dev_id)
+{
+
+	return IRQ_HANDLED;
+}
+
+static struct irqaction xen_resched_irqaction = {
+	.handler =	xen_dummy_handler,
+	.flags =	IRQF_DISABLED,
+	.name =		"resched"
+};
+
+static struct irqaction xen_tlb_irqaction = {
+	.handler =	xen_dummy_handler,
+	.flags =	IRQF_DISABLED,
+	.name =		"tlb_flush"
+};
+#endif
+
+/*
+ * This is the xen version of percpu irq registration, which needs to
+ * bind to the xen-specific evtchn sub-system. One trick here is that
+ * the xen evtchn binding interface depends on kmalloc, because the
+ * related port needs to be freed at device/cpu down. So we cache the
+ * registrations on the BSP before the slab is ready and then deal with
+ * them later. Instances registered after the slab is ready are hooked
+ * to the xen evtchn immediately.
+ *
+ * FIXME: MCA is not supported so far, and thus the "nomca" boot param
+ * is required.
+ */
+static void
+__xen_register_percpu_irq(unsigned int cpu, unsigned int vec,
+			struct irqaction *action, int save)
+{
+	irq_desc_t *desc;
+	int irq = 0;
+
+	if (xen_slab_ready) {
+		switch (vec) {
+		case IA64_TIMER_VECTOR:
+			snprintf(per_cpu(timer_name, cpu),
+				 sizeof(per_cpu(timer_name, cpu)),
+				 "%s%d", action->name, cpu);
+			irq = bind_virq_to_irqhandler(VIRQ_ITC, cpu,
+				action->handler, action->flags,
+				per_cpu(timer_name, cpu), action->dev_id);
+			per_cpu(timer_irq, cpu) = irq;
+			break;
+		case IA64_IPI_RESCHEDULE:
+			snprintf(per_cpu(resched_name, cpu),
+				 sizeof(per_cpu(resched_name, cpu)),
+				 "%s%d", action->name, cpu);
+			irq = bind_ipi_to_irqhandler(RESCHEDULE_VECTOR, cpu,
+				action->handler, action->flags,
+				per_cpu(resched_name, cpu), action->dev_id);
+			per_cpu(resched_irq, cpu) = irq;
+			break;
+		case IA64_IPI_VECTOR:
+			snprintf(per_cpu(ipi_name, cpu),
+				 sizeof(per_cpu(ipi_name, cpu)),
+				 "%s%d", action->name, cpu);
+			irq = bind_ipi_to_irqhandler(IPI_VECTOR, cpu,
+				action->handler, action->flags,
+				per_cpu(ipi_name, cpu), action->dev_id);
+			per_cpu(ipi_irq, cpu) = irq;
+			break;
+		case IA64_CMC_VECTOR:
+			snprintf(per_cpu(cmc_name, cpu),
+				 sizeof(per_cpu(cmc_name, cpu)),
+				 "%s%d", action->name, cpu);
+			irq = bind_virq_to_irqhandler(VIRQ_MCA_CMC, cpu,
+						      action->handler,
+						      action->flags,
+						      per_cpu(cmc_name, cpu),
+						      action->dev_id);
+			per_cpu(cmc_irq, cpu) = irq;
+			break;
+		case IA64_CMCP_VECTOR:
+			snprintf(per_cpu(cmcp_name, cpu),
+				 sizeof(per_cpu(cmcp_name, cpu)),
+				 "%s%d", action->name, cpu);
+			irq = bind_ipi_to_irqhandler(CMCP_VECTOR, cpu,
+						     action->handler,
+						     action->flags,
+						     per_cpu(cmcp_name, cpu),
+						     action->dev_id);
+			per_cpu(cmcp_irq, cpu) = irq;
+			break;
+		case IA64_CPEP_VECTOR:
+			snprintf(per_cpu(cpep_name, cpu),
+				 sizeof(per_cpu(cpep_name, cpu)),
+				 "%s%d", action->name, cpu);
+			irq = bind_ipi_to_irqhandler(CPEP_VECTOR, cpu,
+						     action->handler,
+						     action->flags,
+						     per_cpu(cpep_name, cpu),
+						     action->dev_id);
+			per_cpu(cpep_irq, cpu) = irq;
+			break;
+		case IA64_CPE_VECTOR:
+		case IA64_MCA_RENDEZ_VECTOR:
+		case IA64_PERFMON_VECTOR:
+		case IA64_MCA_WAKEUP_VECTOR:
+		case IA64_SPURIOUS_INT_VECTOR:
+			/* No need to complain, these aren't supported. */
+			break;
+		default:
+			printk(KERN_WARNING "Percpu irq %d is unsupported "
+			       "by xen!\n", vec);
+			break;
+		}
+		BUG_ON(irq < 0);
+
+		if (irq > 0) {
+			/*
+			 * Mark percpu.  Without this, migrate_irqs() will
+			 * mark the interrupt for migrations and trigger it
+			 * on cpu hotplug.
+			 */
+			desc = irq_desc + irq;
+			desc->status |= IRQ_PER_CPU;
+		}
+	}
+
+	/* For the BSP, we cache registered percpu irqs, and then re-walk
+	 * them when initializing the APs.
+	 */
+	if (!cpu && save) {
+		BUG_ON(saved_irq_cnt == MAX_LATE_IRQ);
+		saved_percpu_irqs[saved_irq_cnt].irq = vec;
+		saved_percpu_irqs[saved_irq_cnt].action = action;
+		saved_irq_cnt++;
+		if (!xen_slab_ready)
+			late_irq_cnt++;
+	}
+}
+
+static void
+xen_register_percpu_irq(ia64_vector vec, struct irqaction *action)
+{
+	__xen_register_percpu_irq(smp_processor_id(), vec, action, 1);
+}
+
+static void
+xen_bind_early_percpu_irq(void)
+{
+	int i;
+
+	xen_slab_ready = 1;
+	/* There's no race when accessing this cached array, since only
+	 * the BSP goes through this step, early in boot.
+	 */
+	for (i = 0; i < late_irq_cnt; i++)
+		__xen_register_percpu_irq(smp_processor_id(),
+					  saved_percpu_irqs[i].irq,
+					  saved_percpu_irqs[i].action, 0);
+}
+
+/* FIXME: There's no obvious hook that fires once the slab is ready, so
+ * as a hack we piggyback on the late_time_init hook.
+ */
+extern void (*late_time_init)(void);
+extern char xen_event_callback;
+extern void xen_init_IRQ(void);
+
+#ifdef CONFIG_HOTPLUG_CPU
+static int __devinit
+unbind_evtchn_callback(struct notifier_block *nfb,
+		       unsigned long action, void *hcpu)
+{
+	unsigned int cpu = (unsigned long)hcpu;
+
+	if (action == CPU_DEAD) {
+		/* Unregister evtchn.  */
+		if (per_cpu(cpep_irq, cpu) >= 0) {
+			unbind_from_irqhandler(per_cpu(cpep_irq, cpu), NULL);
+			per_cpu(cpep_irq, cpu) = -1;
+		}
+		if (per_cpu(cmcp_irq, cpu) >= 0) {
+			unbind_from_irqhandler(per_cpu(cmcp_irq, cpu), NULL);
+			per_cpu(cmcp_irq, cpu) = -1;
+		}
+		if (per_cpu(cmc_irq, cpu) >= 0) {
+			unbind_from_irqhandler(per_cpu(cmc_irq, cpu), NULL);
+			per_cpu(cmc_irq, cpu) = -1;
+		}
+		if (per_cpu(ipi_irq, cpu) >= 0) {
+			unbind_from_irqhandler(per_cpu(ipi_irq, cpu), NULL);
+			per_cpu(ipi_irq, cpu) = -1;
+		}
+		if (per_cpu(resched_irq, cpu) >= 0) {
+			unbind_from_irqhandler(per_cpu(resched_irq, cpu),
+						NULL);
+			per_cpu(resched_irq, cpu) = -1;
+		}
+		if (per_cpu(timer_irq, cpu) >= 0) {
+			unbind_from_irqhandler(per_cpu(timer_irq, cpu), NULL);
+			per_cpu(timer_irq, cpu) = -1;
+		}
+	}
+	return NOTIFY_OK;
+}
+
+static struct notifier_block unbind_evtchn_notifier = {
+	.notifier_call = unbind_evtchn_callback,
+	.priority = 0
+};
+#endif
+
+DECLARE_PER_CPU(int, ipi_to_irq[NR_IPIS]);
+void xen_smp_intr_init_early(unsigned int cpu)
+{
+#ifdef CONFIG_SMP
+	unsigned int i;
+
+	for (i = 0; i < saved_irq_cnt; i++)
+		__xen_register_percpu_irq(cpu, saved_percpu_irqs[i].irq,
+					  saved_percpu_irqs[i].action, 0);
+#endif
+}
+
+void xen_smp_intr_init(void)
+{
+#ifdef CONFIG_SMP
+	unsigned int cpu = smp_processor_id();
+	struct callback_register event = {
+		.type = CALLBACKTYPE_event,
+		.address = (unsigned long)&xen_event_callback,
+	};
+
+	if (cpu == 0) {
+		/* Initialization was already done for boot cpu.  */
+#ifdef CONFIG_HOTPLUG_CPU
+		/* Register the notifier only once.  */
+		register_cpu_notifier(&unbind_evtchn_notifier);
+#endif
+		return;
+	}
+
+	/* This should piggyback on setting up the vcpu guest context. */
+	BUG_ON(HYPERVISOR_callback_op(CALLBACKOP_register, &event));
+#endif /* CONFIG_SMP */
+}
+
+void __init
+xen_irq_init(void)
+{
+	struct callback_register event = {
+		.type = CALLBACKTYPE_event,
+		.address = (unsigned long)&xen_event_callback,
+	};
+
+	xen_init_IRQ();
+	BUG_ON(HYPERVISOR_callback_op(CALLBACKOP_register, &event));
+	late_time_init = xen_bind_early_percpu_irq;
+}
+
+void
+xen_platform_send_ipi(int cpu, int vector, int delivery_mode, int redirect)
+{
+	int irq = -1;
+
+#ifdef CONFIG_SMP
+	/* TODO: we need to call vcpu_up here */
+	if (unlikely(vector == ap_wakeup_vector)) {
+		/* XXX
+		 * This should be in __cpu_up(cpu) in ia64 smpboot.c
+		 * like x86. But don't want to modify it,
+		 * keep it untouched.
+		 */
+		xen_smp_intr_init_early(cpu);
+
+		xen_send_ipi(cpu, vector);
+		/* vcpu_prepare_and_up(cpu); */
+		return;
+	}
+#endif
+
+	switch (vector) {
+	case IA64_IPI_VECTOR:
+		irq = per_cpu(ipi_to_irq, cpu)[IPI_VECTOR];
+		break;
+	case IA64_IPI_RESCHEDULE:
+		irq = per_cpu(ipi_to_irq, cpu)[RESCHEDULE_VECTOR];
+		break;
+	case IA64_CMCP_VECTOR:
+		irq = per_cpu(ipi_to_irq, cpu)[CMCP_VECTOR];
+		break;
+	case IA64_CPEP_VECTOR:
+		irq = per_cpu(ipi_to_irq, cpu)[CPEP_VECTOR];
+		break;
+	default:
+		printk(KERN_WARNING "Unsupported IPI type 0x%x\n",
+		       vector);
+		irq = 0;
+		break;
+	}
+
+	BUG_ON(irq < 0);
+	notify_remote_via_irq(irq);
+	return;
+}
+
+static void __init
+xen_init_IRQ_early(void)
+{
+#ifdef CONFIG_SMP
+	register_percpu_irq(IA64_IPI_RESCHEDULE, &xen_resched_irqaction);
+	register_percpu_irq(IA64_IPI_LOCAL_TLB_FLUSH, &xen_tlb_irqaction);
+#endif
+}
+
+static void __init
+xen_init_IRQ_late(void)
+{
+#ifdef CONFIG_XEN_PRIVILEGED_GUEST
+	if (is_running_on_xen() && !ia64_platform_is("xen"))
+		xen_irq_init();
+#endif
+}
+
+static void
+xen_resend_irq(unsigned int vector)
+{
+	(void)resend_irq_on_evtchn(vector);
+}
+
+const struct pv_irq_ops xen_irq_ops __initdata = {
+	.init_IRQ_early = xen_init_IRQ_early,
+	.init_IRQ_late = xen_init_IRQ_late,
+
+	.assign_irq_vector = xen_assign_irq_vector,
+	.free_irq_vector = xen_free_irq_vector,
+	.register_percpu_irq = xen_register_percpu_irq,
+
+	.send_ipi = xen_platform_send_ipi,
+	.resend_irq = xen_resend_irq,
+};
diff --git a/arch/ia64/xen/irq_xen.h b/arch/ia64/xen/irq_xen.h
new file mode 100644
index 0000000..a2c3ed9
--- /dev/null
+++ b/arch/ia64/xen/irq_xen.h
@@ -0,0 +1,8 @@
+#ifndef IRQ_XEN_H
+#define IRQ_XEN_H
+
+extern const struct pv_irq_ops xen_irq_ops __initdata;
+extern void xen_smp_intr_init(void);
+extern void xen_send_ipi(int cpu, int vec);
+
+#endif /* IRQ_XEN_H */
diff --git a/arch/ia64/xen/xen_pv_ops.c b/arch/ia64/xen/xen_pv_ops.c
index c35bb23..93a5c64 100644
--- a/arch/ia64/xen/xen_pv_ops.c
+++ b/arch/ia64/xen/xen_pv_ops.c
@@ -35,6 +35,8 @@
 #include <asm/xen/hypervisor.h>
 #include <asm/xen/xencomm.h>
 
+#include "irq_xen.h"
+
 /***************************************************************************
  * general info
  */
@@ -313,4 +315,5 @@ xen_setup_pv_ops(void)
 	pv_info = xen_info;
 	pv_init_ops = xen_init_ops;
 	pv_iosapic_ops = xen_iosapic_ops;
+	pv_irq_ops = xen_irq_ops;
 }
diff --git a/include/asm-ia64/hw_irq.h b/include/asm-ia64/hw_irq.h
index 678efec..80009cd 100644
--- a/include/asm-ia64/hw_irq.h
+++ b/include/asm-ia64/hw_irq.h
@@ -15,7 +15,11 @@
 #include <asm/ptrace.h>
 #include <asm/smp.h>
 
+#ifndef CONFIG_XEN
 typedef u8 ia64_vector;
+#else
+typedef u16 ia64_vector;
+#endif
 
 /*
  * 0 special
diff --git a/include/asm-ia64/irq.h b/include/asm-ia64/irq.h
index a66d268..aead249 100644
--- a/include/asm-ia64/irq.h
+++ b/include/asm-ia64/irq.h
@@ -14,6 +14,7 @@
 #include <linux/types.h>
 #include <linux/cpumask.h>
 
+#ifndef CONFIG_XEN
 #define NR_VECTORS	256
 
 #if (NR_VECTORS + 32 * NR_CPUS) < 1024
@@ -21,6 +22,38 @@
 #else
 #define NR_IRQS 1024
 #endif
+#else
+/*
+ * The flat IRQ space is divided into two regions:
+ *  1. A one-to-one mapping of real physical IRQs. This space is only used
+ *     if we have physical device-access privilege. This region is at the
+ *     start of the IRQ space so that existing device drivers do not need
+ *     to be modified to translate physical IRQ numbers into our IRQ space.
+ *  2. A dynamic mapping of inter-domain and Xen-sourced virtual IRQs. These
+ *     are bound using the provided bind/unbind functions.
+ */
+
+#define PIRQ_BASE		0
+#define NR_PIRQS		256
+
+#define DYNIRQ_BASE		(PIRQ_BASE + NR_PIRQS)
+#define NR_DYNIRQS		(CONFIG_NR_CPUS * 8)
+
+#define NR_IRQS			(NR_PIRQS + NR_DYNIRQS)
+#define NR_IRQ_VECTORS		NR_IRQS
+
+#define pirq_to_irq(_x)		((_x) + PIRQ_BASE)
+#define irq_to_pirq(_x)		((_x) - PIRQ_BASE)
+
+#define dynirq_to_irq(_x)	((_x) + DYNIRQ_BASE)
+#define irq_to_dynirq(_x)	((_x) - DYNIRQ_BASE)
+
+#define RESCHEDULE_VECTOR	0
+#define IPI_VECTOR		1
+#define CMCP_VECTOR		2
+#define CPEP_VECTOR		3
+#define NR_IPIS			4
+#endif /* CONFIG_XEN */
 
 static __inline__ int
 irq_canonicalize (int irq)
-- 
1.5.3



* Xen common code across architecture
  2008-03-05 18:19 ` [PATCH 50/50] ia64/pv_ops/xen: define xen pv_irq_ops Isaku Yamahata
@ 2008-03-20  9:13   ` Dong, Eddie
  2008-03-20 14:23     ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 7+ messages in thread
From: Dong, Eddie @ 2008-03-20  9:13 UTC (permalink / raw)
  To: virtualization, linux-kernel
  Cc: kvm-ia64-devel, xen-ia64-devel, linux-ia64, Andrew Morton

Jeremy & all:
	Current xen kernel code is in arch/x86/xen, but the xen dynamic
irqchip (events.c) is common to other architectures such as IA64. We
are in the process of enabling pv_ops for IA64 now and want to reuse the
same code. Do we need to move the code somewhere common? Suggestions?
	Thanks, Eddie


* Re: Xen common code across architecture
  2008-03-20  9:13   ` Xen common code across architecture Dong, Eddie
@ 2008-03-20 14:23     ` Jeremy Fitzhardinge
  2008-03-25  6:13       ` Dong, Eddie
  0 siblings, 1 reply; 7+ messages in thread
From: Jeremy Fitzhardinge @ 2008-03-20 14:23 UTC (permalink / raw)
  To: Dong, Eddie
  Cc: virtualization, linux-kernel, Andrew Morton, linux-ia64,
	kvm-ia64-devel, xen-ia64-devel

Dong, Eddie wrote:
> 	Current xen kernel code is in arch/x86/xen, but the xen dynamic
> irqchip (events.c) is common to other architectures such as IA64. We
> are in the process of enabling pv_ops for IA64 now and want to reuse the
> same code. Do we need to move the code somewhere common? Suggestions?

I'm fine with moving common stuff like that to drivers/xen/.

    J


* RE: Xen common code across architecture
  2008-03-20 14:23     ` Jeremy Fitzhardinge
@ 2008-03-25  6:13       ` Dong, Eddie
  2008-03-25  6:35         ` Dong, Eddie
  2008-03-25 15:08         ` Jeremy Fitzhardinge
  0 siblings, 2 replies; 7+ messages in thread
From: Dong, Eddie @ 2008-03-25  6:13 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: virtualization, linux-kernel, Andrew Morton, linux-ia64,
	kvm-ia64-devel, xen-ia64-devel


Jeremy/Andrew:

	Isaku Yamahata, I, and some other IA64/Xen community members are
working together to enable pv_ops for IA64 Linux. This patch is a
preparation to move the common arch/x86/xen/events.c to drivers/xen
(contents are identical), against the mm tree; it is based on Yamahata's
IA64/pv_ops patch series.
	In case you want a brief view of the whole pv_ops/IA64 patch
series, please refer to the IA64 Linux mailing list.

Thanks, Eddie

    Move events.c to drivers/xen for IA64/Xen support.


    Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>

diff -urN old/arch/x86/xen/events.c linux/arch/x86/xen/events.c
--- old/arch/x86/xen/events.c	2008-03-10 13:22:27.000000000 +0800
+++ linux/arch/x86/xen/events.c	1970-01-01 08:00:00.000000000 +0800
@@ -1,591 +0,0 @@
-/*
- * Xen event channels
- *
- * Xen models interrupts with abstract event channels.  Because each
- * domain gets 1024 event channels, but NR_IRQ is not that large, we
- * must dynamically map irqs<->event channels.  The event channels
- * interface with the rest of the kernel by defining a xen interrupt
- * chip.  When an event is recieved, it is mapped to an irq and sent
- * through the normal interrupt processing path.
- *
- * There are four kinds of events which can be mapped to an event
- * channel:
- *
- * 1. Inter-domain notifications.  This includes all the virtual
- *    device events, since they're driven by front-ends in another domain
- *    (typically dom0).
- * 2. VIRQs, typically used for timers.  These are per-cpu events.
- * 3. IPIs.
- * 4. Hardware interrupts. Not supported at present.
- *
- * Jeremy Fitzhardinge <jeremy@xensource.com>, XenSource Inc, 2007
- */
-
-#include <linux/linkage.h>
-#include <linux/interrupt.h>
-#include <linux/irq.h>
-#include <linux/module.h>
-#include <linux/string.h>
-
-#include <asm/ptrace.h>
-#include <asm/irq.h>
-#include <asm/sync_bitops.h>
-#include <asm/xen/hypercall.h>
-#include <asm/xen/hypervisor.h>
-
-#include <xen/events.h>
-#include <xen/interface/xen.h>
-#include <xen/interface/event_channel.h>
-
-#include "xen-ops.h"
-
-/*
- * This lock protects updates to the following mapping and reference-count
- * arrays. The lock does not need to be acquired to read the mapping tables.
- */
-static DEFINE_SPINLOCK(irq_mapping_update_lock);
-
-/* IRQ <-> VIRQ mapping. */
-static DEFINE_PER_CPU(int, virq_to_irq[NR_VIRQS]) = {[0 ... NR_VIRQS-1] = -1};
-
-/* IRQ <-> IPI mapping */
-static DEFINE_PER_CPU(int, ipi_to_irq[XEN_NR_IPIS]) = {[0 ... XEN_NR_IPIS-1] = -1};
-
-/* Packed IRQ information: binding type, sub-type index, and event channel. */
-struct packed_irq
-{
-	unsigned short evtchn;
-	unsigned char index;
-	unsigned char type;
-};
-
-static struct packed_irq irq_info[NR_IRQS];
-
-/* Binding types. */
-enum {
-	IRQT_UNBOUND,
-	IRQT_PIRQ,
-	IRQT_VIRQ,
-	IRQT_IPI,
-	IRQT_EVTCHN
-};
-
-/* Convenient shorthand for packed representation of an unbound IRQ. */
-#define IRQ_UNBOUND	mk_irq_info(IRQT_UNBOUND, 0, 0)
-
-static int evtchn_to_irq[NR_EVENT_CHANNELS] = {
-	[0 ... NR_EVENT_CHANNELS-1] = -1
-};
-static unsigned long cpu_evtchn_mask[NR_CPUS][NR_EVENT_CHANNELS/BITS_PER_LONG];
-static u8 cpu_evtchn[NR_EVENT_CHANNELS];
-
-/* Reference counts for bindings to IRQs. */
-static int irq_bindcount[NR_IRQS];
-
-/* Xen will never allocate port zero for any purpose. */
-#define VALID_EVTCHN(chn)	((chn) != 0)
-
-/*
- * Force a proper event-channel callback from Xen after clearing the
- * callback mask. We do this in a very simple manner, by making a call
- * down into Xen. The pending flag will be checked by Xen on return.
- */
-void force_evtchn_callback(void)
-{
-	(void)HYPERVISOR_xen_version(0, NULL);
-}
-EXPORT_SYMBOL_GPL(force_evtchn_callback);
-
-static struct irq_chip xen_dynamic_chip;
-
-/* Constructor for packed IRQ information. */
-static inline struct packed_irq mk_irq_info(u32 type, u32 index, u32 evtchn)
-{
-	return (struct packed_irq) { evtchn, index, type };
-}
-
-/*
- * Accessors for packed IRQ information.
- */
-static inline unsigned int evtchn_from_irq(int irq)
-{
-	return irq_info[irq].evtchn;
-}
-
-static inline unsigned int index_from_irq(int irq)
-{
-	return irq_info[irq].index;
-}
-
-static inline unsigned int type_from_irq(int irq)
-{
-	return irq_info[irq].type;
-}
-
-static inline unsigned long active_evtchns(unsigned int cpu,
-					   struct shared_info *sh,
-					   unsigned int idx)
-{
-	return (sh->evtchn_pending[idx] &
-		cpu_evtchn_mask[cpu][idx] &
-		~sh->evtchn_mask[idx]);
-}
-
-static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu)
-{
-	int irq = evtchn_to_irq[chn];
-
-	BUG_ON(irq == -1);
-#ifdef CONFIG_SMP
-	irq_desc[irq].affinity = cpumask_of_cpu(cpu);
-#endif
-
-	__clear_bit(chn, cpu_evtchn_mask[cpu_evtchn[chn]]);
-	__set_bit(chn, cpu_evtchn_mask[cpu]);
-
-	cpu_evtchn[chn] = cpu;
-}
-
-static void init_evtchn_cpu_bindings(void)
-{
-#ifdef CONFIG_SMP
-	int i;
-	/* By default all event channels notify CPU#0. */
-	for (i = 0; i < NR_IRQS; i++)
-		irq_desc[i].affinity = cpumask_of_cpu(0);
-#endif
-
-	memset(cpu_evtchn, 0, sizeof(cpu_evtchn));
-	memset(cpu_evtchn_mask[0], ~0, sizeof(cpu_evtchn_mask[0]));
-}
-
-static inline unsigned int cpu_from_evtchn(unsigned int evtchn)
-{
-	return cpu_evtchn[evtchn];
-}
-
-static inline void clear_evtchn(int port)
-{
-	struct shared_info *s = HYPERVISOR_shared_info;
-	sync_clear_bit(port, &s->evtchn_pending[0]);
-}
-
-static inline void set_evtchn(int port)
-{
-	struct shared_info *s = HYPERVISOR_shared_info;
-	sync_set_bit(port, &s->evtchn_pending[0]);
-}
-
-
-/**
- * notify_remote_via_irq - send event to remote end of event channel via irq
- * @irq: irq of event channel to send event to
- *
- * Unlike notify_remote_via_evtchn(), this is safe to use across
- * save/restore. Notifications on a broken connection are silently
- * dropped.
- */
-void notify_remote_via_irq(int irq)
-{
-	int evtchn = evtchn_from_irq(irq);
-
-	if (VALID_EVTCHN(evtchn))
-		notify_remote_via_evtchn(evtchn);
-}
-EXPORT_SYMBOL_GPL(notify_remote_via_irq);
-
-static void mask_evtchn(int port)
-{
-	struct shared_info *s = HYPERVISOR_shared_info;
-	sync_set_bit(port, &s->evtchn_mask[0]);
-}
-
-static void unmask_evtchn(int port)
-{
-	struct shared_info *s = HYPERVISOR_shared_info;
-	unsigned int cpu = get_cpu();
-
-	BUG_ON(!irqs_disabled());
-
-	/* Slow path (hypercall) if this is a non-local port. */
-	if (unlikely(cpu != cpu_from_evtchn(port))) {
-		struct evtchn_unmask unmask = { .port = port };
-		(void)HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask);
-	} else {
-		struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu);
-
-		sync_clear_bit(port, &s->evtchn_mask[0]);
-
-		/*
-		 * The following is basically the equivalent of
-		 * 'hw_resend_irq'. Just like a real IO-APIC we 'lose
-		 * the interrupt edge' if the channel is masked.
-		 */
-		if (sync_test_bit(port, &s->evtchn_pending[0]) &&
-		    !sync_test_and_set_bit(port / BITS_PER_LONG,
-					   &vcpu_info->evtchn_pending_sel))
-			vcpu_info->evtchn_upcall_pending = 1;
-	}
-
-	put_cpu();
-}
-
-static int find_unbound_irq(void)
-{
-	int irq;
-
-	/* Only allocate from dynirq range */
-	for (irq = 0; irq < NR_IRQS; irq++)
-		if (irq_bindcount[irq] == 0)
-			break;
-
-	if (irq == NR_IRQS)
-		panic("No available IRQ to bind to: increase NR_IRQS!\n");
-
-	return irq;
-}
-
-int bind_evtchn_to_irq(unsigned int evtchn)
-{
-	int irq;
-
-	spin_lock(&irq_mapping_update_lock);
-
-	irq = evtchn_to_irq[evtchn];
-
-	if (irq == -1) {
-		irq = find_unbound_irq();
-
-		dynamic_irq_init(irq);
-		set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
-					      handle_level_irq, "event");
-
-		evtchn_to_irq[evtchn] = irq;
-		irq_info[irq] = mk_irq_info(IRQT_EVTCHN, 0, evtchn);
-	}
-
-	irq_bindcount[irq]++;
-
-	spin_unlock(&irq_mapping_update_lock);
-
-	return irq;
-}
-EXPORT_SYMBOL_GPL(bind_evtchn_to_irq);
-
-static int bind_ipi_to_irq(unsigned int ipi, unsigned int cpu)
-{
-	struct evtchn_bind_ipi bind_ipi;
-	int evtchn, irq;
-
-	spin_lock(&irq_mapping_update_lock);
-
-	irq = per_cpu(ipi_to_irq, cpu)[ipi];
-	if (irq == -1) {
-		irq = find_unbound_irq();
-		if (irq < 0)
-			goto out;
-
-		dynamic_irq_init(irq);
-		set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
-					      handle_level_irq, "ipi");
-
-		bind_ipi.vcpu = cpu;
-		if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_ipi,
-						&bind_ipi) != 0)
-			BUG();
-		evtchn = bind_ipi.port;
-
-		evtchn_to_irq[evtchn] = irq;
-		irq_info[irq] = mk_irq_info(IRQT_IPI, ipi, evtchn);
-
-		per_cpu(ipi_to_irq, cpu)[ipi] = irq;
-
-		bind_evtchn_to_cpu(evtchn, cpu);
-	}
-
-	irq_bindcount[irq]++;
-
- out:
-	spin_unlock(&irq_mapping_update_lock);
-	return irq;
-}
-
-
-static int bind_virq_to_irq(unsigned int virq, unsigned int cpu)
-{
-	struct evtchn_bind_virq bind_virq;
-	int evtchn, irq;
-
-	spin_lock(&irq_mapping_update_lock);
-
-	irq = per_cpu(virq_to_irq, cpu)[virq];
-
-	if (irq == -1) {
-		bind_virq.virq = virq;
-		bind_virq.vcpu = cpu;
-		if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq,
-						&bind_virq) != 0)
-			BUG();
-		evtchn = bind_virq.port;
-
-		irq = find_unbound_irq();
-
-		dynamic_irq_init(irq);
-		set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
-					      handle_level_irq, "virq");
-
-		evtchn_to_irq[evtchn] = irq;
-		irq_info[irq] = mk_irq_info(IRQT_VIRQ, virq, evtchn);
-
-		per_cpu(virq_to_irq, cpu)[virq] = irq;
-
-		bind_evtchn_to_cpu(evtchn, cpu);
-	}
-
-	irq_bindcount[irq]++;
-
-	spin_unlock(&irq_mapping_update_lock);
-
-	return irq;
-}
-
-static void unbind_from_irq(unsigned int irq)
-{
-	struct evtchn_close close;
-	int evtchn = evtchn_from_irq(irq);
-
-	spin_lock(&irq_mapping_update_lock);
-
-	if (VALID_EVTCHN(evtchn) && (--irq_bindcount[irq] == 0)) {
-		close.port = evtchn;
-		if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0)
-			BUG();
-
-		switch (type_from_irq(irq)) {
-		case IRQT_VIRQ:
-			per_cpu(virq_to_irq, cpu_from_evtchn(evtchn))
-				[index_from_irq(irq)] = -1;
-			break;
-		default:
-			break;
-		}
-
-		/* Closed ports are implicitly re-bound to VCPU0. */
-		bind_evtchn_to_cpu(evtchn, 0);
-
-		evtchn_to_irq[evtchn] = -1;
-		irq_info[irq] = IRQ_UNBOUND;
-
-		dynamic_irq_init(irq);
-	}
-
-	spin_unlock(&irq_mapping_update_lock);
-}
-
-int bind_evtchn_to_irqhandler(unsigned int evtchn,
-			      irq_handler_t handler,
-			      unsigned long irqflags,
-			      const char *devname, void *dev_id)
-{
-	unsigned int irq;
-	int retval;
-
-	irq = bind_evtchn_to_irq(evtchn);
-	retval = request_irq(irq, handler, irqflags, devname, dev_id);
-	if (retval != 0) {
-		unbind_from_irq(irq);
-		return retval;
-	}
-
-	return irq;
-}
-EXPORT_SYMBOL_GPL(bind_evtchn_to_irqhandler);
-
-int bind_virq_to_irqhandler(unsigned int virq, unsigned int cpu,
-			    irq_handler_t handler,
-			    unsigned long irqflags, const char *devname, void *dev_id)
-{
-	unsigned int irq;
-	int retval;
-
-	irq = bind_virq_to_irq(virq, cpu);
-	retval = request_irq(irq, handler, irqflags, devname, dev_id);
-	if (retval != 0) {
-		unbind_from_irq(irq);
-		return retval;
-	}
-
-	return irq;
-}
-EXPORT_SYMBOL_GPL(bind_virq_to_irqhandler);
-
-int bind_ipi_to_irqhandler(enum ipi_vector ipi,
-			   unsigned int cpu,
-			   irq_handler_t handler,
-			   unsigned long irqflags,
-			   const char *devname,
-			   void *dev_id)
-{
-	int irq, retval;
-
-	irq = bind_ipi_to_irq(ipi, cpu);
-	if (irq < 0)
-		return irq;
-
-	retval = request_irq(irq, handler, irqflags, devname, dev_id);
-	if (retval != 0) {
-		unbind_from_irq(irq);
-		return retval;
-	}
-
-	return irq;
-}
-
-void unbind_from_irqhandler(unsigned int irq, void *dev_id)
-{
-	free_irq(irq, dev_id);
-	unbind_from_irq(irq);
-}
-EXPORT_SYMBOL_GPL(unbind_from_irqhandler);
-
-void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector)
-{
-	int irq = per_cpu(ipi_to_irq, cpu)[vector];
-	BUG_ON(irq < 0);
-	notify_remote_via_irq(irq);
-}
-
-
-/*
- * Search the CPUs pending events bitmasks.  For each one found, map
- * the event number to an irq, and feed it into do_IRQ() for
- * handling.
- *
- * Xen uses a two-level bitmap to speed searching.  The first level is
- * a bitset of words which contain pending event bits.  The second
- * level is a bitset of pending events themselves.
- */
-void xen_evtchn_do_upcall(struct pt_regs *regs)
-{
-	int cpu = get_cpu();
-	struct shared_info *s = HYPERVISOR_shared_info;
-	struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu);
-	unsigned long pending_words;
-
-	vcpu_info->evtchn_upcall_pending = 0;
-
-	/* NB. No need for a barrier here -- XCHG is a barrier on x86. */
-	pending_words = xchg(&vcpu_info->evtchn_pending_sel, 0);
-	while (pending_words != 0) {
-		unsigned long pending_bits;
-		int word_idx = __ffs(pending_words);
-		pending_words &= ~(1UL << word_idx);
-
-		while ((pending_bits = active_evtchns(cpu, s, word_idx)) != 0) {
-			int bit_idx = __ffs(pending_bits);
-			int port = (word_idx * BITS_PER_LONG) + bit_idx;
-			int irq = evtchn_to_irq[port];
-
-			if (irq != -1) {
-				regs->orig_ax = ~irq;
-				do_IRQ(regs);
-			}
-		}
-	}
-
-	put_cpu();
-}
-
-/* Rebind an evtchn so that it gets delivered to a specific cpu */
-static void rebind_irq_to_cpu(unsigned irq, unsigned tcpu)
-{
-	struct evtchn_bind_vcpu bind_vcpu;
-	int evtchn = evtchn_from_irq(irq);
-
-	if (!VALID_EVTCHN(evtchn))
-		return;
-
-	/* Send future instances of this interrupt to other vcpu. */
-	bind_vcpu.port = evtchn;
-	bind_vcpu.vcpu = tcpu;
-
-	/*
-	 * If this fails, it usually just indicates that we're dealing with a
-	 * virq or IPI channel, which don't actually need to be rebound. Ignore
-	 * it, but don't do the xenlinux-level rebind in that case.
-	 */
-	if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) >= 0)
-		bind_evtchn_to_cpu(evtchn, tcpu);
-}
-
-
-static void set_affinity_irq(unsigned irq, cpumask_t dest)
-{
-	unsigned tcpu = first_cpu(dest);
-	rebind_irq_to_cpu(irq, tcpu);
-}
-
-static void enable_dynirq(unsigned int irq)
-{
-	int evtchn = evtchn_from_irq(irq);
-
-	if (VALID_EVTCHN(evtchn))
-		unmask_evtchn(evtchn);
-}
-
-static void disable_dynirq(unsigned int irq)
-{
-	int evtchn = evtchn_from_irq(irq);
-
-	if (VALID_EVTCHN(evtchn))
-		mask_evtchn(evtchn);
-}
-
-static void ack_dynirq(unsigned int irq)
-{
-	int evtchn = evtchn_from_irq(irq);
-
-	move_native_irq(irq);
-
-	if (VALID_EVTCHN(evtchn))
-		clear_evtchn(evtchn);
-}
-
-static int retrigger_dynirq(unsigned int irq)
-{
-	int evtchn = evtchn_from_irq(irq);
-	int ret = 0;
-
-	if (VALID_EVTCHN(evtchn)) {
-		set_evtchn(evtchn);
-		ret = 1;
-	}
-
-	return ret;
-}
-
-static struct irq_chip xen_dynamic_chip __read_mostly = {
-	.name		= "xen-dyn",
-	.mask		= disable_dynirq,
-	.unmask		= enable_dynirq,
-	.ack		= ack_dynirq,
-	.set_affinity	= set_affinity_irq,
-	.retrigger	= retrigger_dynirq,
-};
-
-void __init xen_init_IRQ(void)
-{
-	int i;
-
-	init_evtchn_cpu_bindings();
-
-	/* No event channels are 'live' right now. */
-	for (i = 0; i < NR_EVENT_CHANNELS; i++)
-		mask_evtchn(i);
-
-	/* Dynamic IRQ space is currently unbound. Zero the refcnts. */
-	for (i = 0; i < NR_IRQS; i++)
-		irq_bindcount[i] = 0;
-
-	irq_ctx_init(smp_processor_id());
-}
diff -urN old/arch/x86/xen/Makefile linux/arch/x86/xen/Makefile
--- old/arch/x86/xen/Makefile	2008-03-10 13:22:27.000000000 +0800
+++ linux/arch/x86/xen/Makefile	2008-03-25 13:56:41.367764448 +0800
@@ -1,4 +1,4 @@
 obj-y		:= enlighten.o setup.o features.o multicalls.o mmu.o \
-			events.o time.o manage.o xen-asm.o
+			time.o manage.o xen-asm.o
 
 obj-$(CONFIG_SMP)	+= smp.o
diff -urN old/arch/x86/xen/xen-ops.h linux/arch/x86/xen/xen-ops.h
--- old/arch/x86/xen/xen-ops.h	2008-03-25 13:21:09.996527604 +0800
+++ linux/arch/x86/xen/xen-ops.h	2008-03-25 13:59:16.349809137 +0800
@@ -2,6 +2,7 @@
 #define XEN_OPS_H
 
 #include <linux/init.h>
+#include <xen/xen-ops.h>
 
 /* These are code, but not functions.  Defined in entry.S */
 extern const char xen_hypervisor_callback[];
@@ -9,7 +10,6 @@
 
 void xen_copy_trap_info(struct trap_info *traps);
 
-DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu);
 DECLARE_PER_CPU(unsigned long, xen_cr3);
 DECLARE_PER_CPU(unsigned long, xen_current_cr3);
 
diff -urN old/drivers/xen/events.c linux/drivers/xen/events.c
--- old/drivers/xen/events.c	1970-01-01 08:00:00.000000000 +0800
+++ linux/drivers/xen/events.c	2008-03-25 13:56:41.368764287 +0800
@@ -0,0 +1,591 @@
+/*
+ * Xen event channels
+ *
+ * Xen models interrupts with abstract event channels.  Because each
+ * domain gets 1024 event channels, but NR_IRQ is not that large, we
+ * must dynamically map irqs<->event channels.  The event channels
+ * interface with the rest of the kernel by defining a xen interrupt
+ * chip.  When an event is received, it is mapped to an irq and sent
+ * through the normal interrupt processing path.
+ *
+ * There are four kinds of events which can be mapped to an event
+ * channel:
+ *
+ * 1. Inter-domain notifications.  This includes all the virtual
+ *    device events, since they're driven by front-ends in another domain
+ *    (typically dom0).
+ * 2. VIRQs, typically used for timers.  These are per-cpu events.
+ * 3. IPIs.
+ * 4. Hardware interrupts. Not supported at present.
+ *
+ * Jeremy Fitzhardinge <jeremy@xensource.com>, XenSource Inc, 2007
+ */
+
+#include <linux/linkage.h>
+#include <linux/interrupt.h>
+#include <linux/irq.h>
+#include <linux/module.h>
+#include <linux/string.h>
+
+#include <asm/ptrace.h>
+#include <asm/irq.h>
+#include <asm/sync_bitops.h>
+#include <asm/xen/hypercall.h>
+#include <asm/xen/hypervisor.h>
+
+#include <xen/events.h>
+#include <xen/interface/xen.h>
+#include <xen/interface/event_channel.h>
+
+#include "xen-ops.h"
+
+/*
+ * This lock protects updates to the following mapping and reference-count
+ * arrays. The lock does not need to be acquired to read the mapping tables.
+ */
+static DEFINE_SPINLOCK(irq_mapping_update_lock);
+
+/* IRQ <-> VIRQ mapping. */
+static DEFINE_PER_CPU(int, virq_to_irq[NR_VIRQS]) = {[0 ... NR_VIRQS-1] = -1};
+
+/* IRQ <-> IPI mapping */
+static DEFINE_PER_CPU(int, ipi_to_irq[XEN_NR_IPIS]) = {[0 ... XEN_NR_IPIS-1] = -1};
+
+/* Packed IRQ information: binding type, sub-type index, and event channel. */
+struct packed_irq
+{
+	unsigned short evtchn;
+	unsigned char index;
+	unsigned char type;
+};
+
+static struct packed_irq irq_info[NR_IRQS];
+
+/* Binding types. */
+enum {
+	IRQT_UNBOUND,
+	IRQT_PIRQ,
+	IRQT_VIRQ,
+	IRQT_IPI,
+	IRQT_EVTCHN
+};
+
+/* Convenient shorthand for packed representation of an unbound IRQ. */
+#define IRQ_UNBOUND	mk_irq_info(IRQT_UNBOUND, 0, 0)
+
+static int evtchn_to_irq[NR_EVENT_CHANNELS] = {
+	[0 ... NR_EVENT_CHANNELS-1] = -1
+};
+static unsigned long cpu_evtchn_mask[NR_CPUS][NR_EVENT_CHANNELS/BITS_PER_LONG];
+static u8 cpu_evtchn[NR_EVENT_CHANNELS];
+
+/* Reference counts for bindings to IRQs. */
+static int irq_bindcount[NR_IRQS];
+
+/* Xen will never allocate port zero for any purpose. */
+#define VALID_EVTCHN(chn)	((chn) != 0)
+
+/*
+ * Force a proper event-channel callback from Xen after clearing the
+ * callback mask. We do this in a very simple manner, by making a call
+ * down into Xen. The pending flag will be checked by Xen on return.
+ */
+void force_evtchn_callback(void)
+{
+	(void)HYPERVISOR_xen_version(0, NULL);
+}
+EXPORT_SYMBOL_GPL(force_evtchn_callback);
+
+static struct irq_chip xen_dynamic_chip;
+
+/* Constructor for packed IRQ information. */
+static inline struct packed_irq mk_irq_info(u32 type, u32 index, u32 evtchn)
+{
+	return (struct packed_irq) { evtchn, index, type };
+}
+
+/*
+ * Accessors for packed IRQ information.
+ */
+static inline unsigned int evtchn_from_irq(int irq)
+{
+	return irq_info[irq].evtchn;
+}
+
+static inline unsigned int index_from_irq(int irq)
+{
+	return irq_info[irq].index;
+}
+
+static inline unsigned int type_from_irq(int irq)
+{
+	return irq_info[irq].type;
+}
+
+static inline unsigned long active_evtchns(unsigned int cpu,
+					   struct shared_info *sh,
+					   unsigned int idx)
+{
+	return (sh->evtchn_pending[idx] &
+		cpu_evtchn_mask[cpu][idx] &
+		~sh->evtchn_mask[idx]);
+}
+
+static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu)
+{
+	int irq = evtchn_to_irq[chn];
+
+	BUG_ON(irq == -1);
+#ifdef CONFIG_SMP
+	irq_desc[irq].affinity = cpumask_of_cpu(cpu);
+#endif
+
+	__clear_bit(chn, cpu_evtchn_mask[cpu_evtchn[chn]]);
+	__set_bit(chn, cpu_evtchn_mask[cpu]);
+
+	cpu_evtchn[chn] = cpu;
+}
+
+static void init_evtchn_cpu_bindings(void)
+{
+#ifdef CONFIG_SMP
+	int i;
+	/* By default all event channels notify CPU#0. */
+	for (i = 0; i < NR_IRQS; i++)
+		irq_desc[i].affinity = cpumask_of_cpu(0);
+#endif
+
+	memset(cpu_evtchn, 0, sizeof(cpu_evtchn));
+	memset(cpu_evtchn_mask[0], ~0, sizeof(cpu_evtchn_mask[0]));
+}
+
+static inline unsigned int cpu_from_evtchn(unsigned int evtchn)
+{
+	return cpu_evtchn[evtchn];
+}
+
+static inline void clear_evtchn(int port)
+{
+	struct shared_info *s = HYPERVISOR_shared_info;
+	sync_clear_bit(port, &s->evtchn_pending[0]);
+}
+
+static inline void set_evtchn(int port)
+{
+	struct shared_info *s = HYPERVISOR_shared_info;
+	sync_set_bit(port, &s->evtchn_pending[0]);
+}
+
+
+/**
+ * notify_remote_via_irq - send event to remote end of event channel via irq
+ * @irq: irq of event channel to send event to
+ *
+ * Unlike notify_remote_via_evtchn(), this is safe to use across
+ * save/restore. Notifications on a broken connection are silently
+ * dropped.
+ */
+void notify_remote_via_irq(int irq)
+{
+	int evtchn = evtchn_from_irq(irq);
+
+	if (VALID_EVTCHN(evtchn))
+		notify_remote_via_evtchn(evtchn);
+}
+EXPORT_SYMBOL_GPL(notify_remote_via_irq);
+
+static void mask_evtchn(int port)
+{
+	struct shared_info *s = HYPERVISOR_shared_info;
+	sync_set_bit(port, &s->evtchn_mask[0]);
+}
+
+static void unmask_evtchn(int port)
+{
+	struct shared_info *s = HYPERVISOR_shared_info;
+	unsigned int cpu = get_cpu();
+
+	BUG_ON(!irqs_disabled());
+
+	/* Slow path (hypercall) if this is a non-local port. */
+	if (unlikely(cpu != cpu_from_evtchn(port))) {
+		struct evtchn_unmask unmask = { .port = port };
+		(void)HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask);
+	} else {
+		struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu);
+
+		sync_clear_bit(port, &s->evtchn_mask[0]);
+
+		/*
+		 * The following is basically the equivalent of
+		 * 'hw_resend_irq'. Just like a real IO-APIC we 'lose
+		 * the interrupt edge' if the channel is masked.
+		 */
+		if (sync_test_bit(port, &s->evtchn_pending[0]) &&
+		    !sync_test_and_set_bit(port / BITS_PER_LONG,
+					   &vcpu_info->evtchn_pending_sel))
+			vcpu_info->evtchn_upcall_pending = 1;
+	}
+
+	put_cpu();
+}
+
+static int find_unbound_irq(void)
+{
+	int irq;
+
+	/* Only allocate from dynirq range */
+	for (irq = 0; irq < NR_IRQS; irq++)
+		if (irq_bindcount[irq] == 0)
+			break;
+
+	if (irq == NR_IRQS)
+		panic("No available IRQ to bind to: increase NR_IRQS!\n");
+
+	return irq;
+}
+
+int bind_evtchn_to_irq(unsigned int evtchn)
+{
+	int irq;
+
+	spin_lock(&irq_mapping_update_lock);
+
+	irq = evtchn_to_irq[evtchn];
+
+	if (irq == -1) {
+		irq = find_unbound_irq();
+
+		dynamic_irq_init(irq);
+		set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
+					      handle_level_irq, "event");
+
+		evtchn_to_irq[evtchn] = irq;
+		irq_info[irq] = mk_irq_info(IRQT_EVTCHN, 0, evtchn);
+	}
+
+	irq_bindcount[irq]++;
+
+	spin_unlock(&irq_mapping_update_lock);
+
+	return irq;
+}
+EXPORT_SYMBOL_GPL(bind_evtchn_to_irq);
+
+static int bind_ipi_to_irq(unsigned int ipi, unsigned int cpu)
+{
+	struct evtchn_bind_ipi bind_ipi;
+	int evtchn, irq;
+
+	spin_lock(&irq_mapping_update_lock);
+
+	irq = per_cpu(ipi_to_irq, cpu)[ipi];
+	if (irq == -1) {
+		irq = find_unbound_irq();
+		if (irq < 0)
+			goto out;
+
+		dynamic_irq_init(irq);
+		set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
+					      handle_level_irq, "ipi");
+
+		bind_ipi.vcpu = cpu;
+		if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_ipi,
+						&bind_ipi) != 0)
+			BUG();
+		evtchn = bind_ipi.port;
+
+		evtchn_to_irq[evtchn] = irq;
+		irq_info[irq] = mk_irq_info(IRQT_IPI, ipi, evtchn);
+
+		per_cpu(ipi_to_irq, cpu)[ipi] = irq;
+
+		bind_evtchn_to_cpu(evtchn, cpu);
+	}
+
+	irq_bindcount[irq]++;
+
+ out:
+	spin_unlock(&irq_mapping_update_lock);
+	return irq;
+}
+
+
+static int bind_virq_to_irq(unsigned int virq, unsigned int cpu)
+{
+	struct evtchn_bind_virq bind_virq;
+	int evtchn, irq;
+
+	spin_lock(&irq_mapping_update_lock);
+
+	irq = per_cpu(virq_to_irq, cpu)[virq];
+
+	if (irq == -1) {
+		bind_virq.virq = virq;
+		bind_virq.vcpu = cpu;
+		if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq,
+						&bind_virq) != 0)
+			BUG();
+		evtchn = bind_virq.port;
+
+		irq = find_unbound_irq();
+
+		dynamic_irq_init(irq);
+		set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
+					      handle_level_irq, "virq");
+
+		evtchn_to_irq[evtchn] = irq;
+		irq_info[irq] = mk_irq_info(IRQT_VIRQ, virq, evtchn);
+
+		per_cpu(virq_to_irq, cpu)[virq] = irq;
+
+		bind_evtchn_to_cpu(evtchn, cpu);
+	}
+
+	irq_bindcount[irq]++;
+
+	spin_unlock(&irq_mapping_update_lock);
+
+	return irq;
+}
+
+static void unbind_from_irq(unsigned int irq)
+{
+	struct evtchn_close close;
+	int evtchn = evtchn_from_irq(irq);
+
+	spin_lock(&irq_mapping_update_lock);
+
+	if (VALID_EVTCHN(evtchn) && (--irq_bindcount[irq] == 0)) {
+		close.port = evtchn;
+		if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0)
+			BUG();
+
+		switch (type_from_irq(irq)) {
+		case IRQT_VIRQ:
+			per_cpu(virq_to_irq, cpu_from_evtchn(evtchn))
+				[index_from_irq(irq)] = -1;
+			break;
+		default:
+			break;
+		}
+
+		/* Closed ports are implicitly re-bound to VCPU0. */
+		bind_evtchn_to_cpu(evtchn, 0);
+
+		evtchn_to_irq[evtchn] = -1;
+		irq_info[irq] = IRQ_UNBOUND;
+
+		dynamic_irq_init(irq);
+	}
+
+	spin_unlock(&irq_mapping_update_lock);
+}
+
+int bind_evtchn_to_irqhandler(unsigned int evtchn,
+			      irq_handler_t handler,
+			      unsigned long irqflags,
+			      const char *devname, void *dev_id)
+{
+	unsigned int irq;
+	int retval;
+
+	irq = bind_evtchn_to_irq(evtchn);
+	retval = request_irq(irq, handler, irqflags, devname, dev_id);
+	if (retval != 0) {
+		unbind_from_irq(irq);
+		return retval;
+	}
+
+	return irq;
+}
+EXPORT_SYMBOL_GPL(bind_evtchn_to_irqhandler);
+
+int bind_virq_to_irqhandler(unsigned int virq, unsigned int cpu,
+			    irq_handler_t handler,
+			    unsigned long irqflags, const char *devname, void *dev_id)
+{
+	unsigned int irq;
+	int retval;
+
+	irq = bind_virq_to_irq(virq, cpu);
+	retval = request_irq(irq, handler, irqflags, devname, dev_id);
+	if (retval != 0) {
+		unbind_from_irq(irq);
+		return retval;
+	}
+
+	return irq;
+}
+EXPORT_SYMBOL_GPL(bind_virq_to_irqhandler);
+
+int bind_ipi_to_irqhandler(enum ipi_vector ipi,
+			   unsigned int cpu,
+			   irq_handler_t handler,
+			   unsigned long irqflags,
+			   const char *devname,
+			   void *dev_id)
+{
+	int irq, retval;
+
+	irq = bind_ipi_to_irq(ipi, cpu);
+	if (irq < 0)
+		return irq;
+
+	retval = request_irq(irq, handler, irqflags, devname, dev_id);
+	if (retval != 0) {
+		unbind_from_irq(irq);
+		return retval;
+	}
+
+	return irq;
+}
+
+void unbind_from_irqhandler(unsigned int irq, void *dev_id)
+{
+	free_irq(irq, dev_id);
+	unbind_from_irq(irq);
+}
+EXPORT_SYMBOL_GPL(unbind_from_irqhandler);
+
+void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector)
+{
+	int irq = per_cpu(ipi_to_irq, cpu)[vector];
+	BUG_ON(irq < 0);
+	notify_remote_via_irq(irq);
+}
+
+
+/*
+ * Search the CPUs pending events bitmasks.  For each one found, map
+ * the event number to an irq, and feed it into do_IRQ() for
+ * handling.
+ *
+ * Xen uses a two-level bitmap to speed searching.  The first level is
+ * a bitset of words which contain pending event bits.  The second
+ * level is a bitset of pending events themselves.
+ */
+void xen_evtchn_do_upcall(struct pt_regs *regs)
+{
+	int cpu = get_cpu();
+	struct shared_info *s = HYPERVISOR_shared_info;
+	struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu);
+	unsigned long pending_words;
+
+	vcpu_info->evtchn_upcall_pending = 0;
+
+	/* NB. No need for a barrier here -- XCHG is a barrier on x86. */
+	pending_words = xchg(&vcpu_info->evtchn_pending_sel, 0);
+	while (pending_words != 0) {
+		unsigned long pending_bits;
+		int word_idx = __ffs(pending_words);
+		pending_words &= ~(1UL << word_idx);
+
+		while ((pending_bits = active_evtchns(cpu, s, word_idx)) != 0) {
+			int bit_idx = __ffs(pending_bits);
+			int port = (word_idx * BITS_PER_LONG) + bit_idx;
+			int irq = evtchn_to_irq[port];
+
+			if (irq != -1) {
+				regs->orig_ax = ~irq;
+				do_IRQ(regs);
+			}
+		}
+	}
+
+	put_cpu();
+}
+
+/* Rebind an evtchn so that it gets delivered to a specific cpu */
+static void rebind_irq_to_cpu(unsigned irq, unsigned tcpu)
+{
+	struct evtchn_bind_vcpu bind_vcpu;
+	int evtchn = evtchn_from_irq(irq);
+
+	if (!VALID_EVTCHN(evtchn))
+		return;
+
+	/* Send future instances of this interrupt to other vcpu. */
+	bind_vcpu.port = evtchn;
+	bind_vcpu.vcpu = tcpu;
+
+	/*
+	 * If this fails, it usually just indicates that we're dealing with a
+	 * virq or IPI channel, which don't actually need to be rebound. Ignore
+	 * it, but don't do the xenlinux-level rebind in that case.
+	 */
+	if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) >= 0)
+		bind_evtchn_to_cpu(evtchn, tcpu);
+}
+
+
+static void set_affinity_irq(unsigned irq, cpumask_t dest)
+{
+	unsigned tcpu = first_cpu(dest);
+	rebind_irq_to_cpu(irq, tcpu);
+}
+
+static void enable_dynirq(unsigned int irq)
+{
+	int evtchn = evtchn_from_irq(irq);
+
+	if (VALID_EVTCHN(evtchn))
+		unmask_evtchn(evtchn);
+}
+
+static void disable_dynirq(unsigned int irq)
+{
+	int evtchn = evtchn_from_irq(irq);
+
+	if (VALID_EVTCHN(evtchn))
+		mask_evtchn(evtchn);
+}
+
+static void ack_dynirq(unsigned int irq)
+{
+	int evtchn = evtchn_from_irq(irq);
+
+	move_native_irq(irq);
+
+	if (VALID_EVTCHN(evtchn))
+		clear_evtchn(evtchn);
+}
+
+static int retrigger_dynirq(unsigned int irq)
+{
+	int evtchn = evtchn_from_irq(irq);
+	int ret = 0;
+
+	if (VALID_EVTCHN(evtchn)) {
+		set_evtchn(evtchn);
+		ret = 1;
+	}
+
+	return ret;
+}
+
+static struct irq_chip xen_dynamic_chip __read_mostly = {
+	.name		= "xen-dyn",
+	.mask		= disable_dynirq,
+	.unmask		= enable_dynirq,
+	.ack		= ack_dynirq,
+	.set_affinity	= set_affinity_irq,
+	.retrigger	= retrigger_dynirq,
+};
+
+void __init xen_init_IRQ(void)
+{
+	int i;
+
+	init_evtchn_cpu_bindings();
+
+	/* No event channels are 'live' right now. */
+	for (i = 0; i < NR_EVENT_CHANNELS; i++)
+		mask_evtchn(i);
+
+	/* Dynamic IRQ space is currently unbound. Zero the refcnts. */
+	for (i = 0; i < NR_IRQS; i++)
+		irq_bindcount[i] = 0;
+
+	irq_ctx_init(smp_processor_id());
+}
diff -urN old/drivers/xen/Makefile linux/drivers/xen/Makefile
--- old/drivers/xen/Makefile	2008-03-10 13:22:27.000000000 +0800
+++ linux/drivers/xen/Makefile	2008-03-25 13:56:41.368764287 +0800
@@ -1,2 +1,2 @@
-obj-y	+= grant-table.o
+obj-y	+= grant-table.o events.o
 obj-y	+= xenbus/
diff -urN old/include/xen/xen-ops.h linux/include/xen/xen-ops.h
--- old/include/xen/xen-ops.h	1970-01-01 08:00:00.000000000 +0800
+++ linux/include/xen/xen-ops.h	2008-03-25 14:00:09.041321546 +0800
@@ -0,0 +1,6 @@
+#ifndef INCLUDE_XEN_OPS_H
+#define INCLUDE_XEN_OPS_H
+
+DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu);
+
+#endif /* INCLUDE_XEN_OPS_H */

    Move events.c to drivers/xen for IA64/Xen support.


    Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>

-{
-	struct evtchn_bind_ipi bind_ipi;
-	int evtchn, irq;
-
-	spin_lock(&irq_mapping_update_lock);
-
-	irq = per_cpu(ipi_to_irq, cpu)[ipi];
-	if (irq == -1) {
-		irq = find_unbound_irq();
-		if (irq < 0)
-			goto out;
-
-		dynamic_irq_init(irq);
-		set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
-					      handle_level_irq, "ipi");
-
-		bind_ipi.vcpu = cpu;
-		if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_ipi,
-						&bind_ipi) != 0)
-			BUG();
-		evtchn = bind_ipi.port;
-
-		evtchn_to_irq[evtchn] = irq;
-		irq_info[irq] = mk_irq_info(IRQT_IPI, ipi, evtchn);
-
-		per_cpu(ipi_to_irq, cpu)[ipi] = irq;
-
-		bind_evtchn_to_cpu(evtchn, cpu);
-	}
-
-	irq_bindcount[irq]++;
-
- out:
-	spin_unlock(&irq_mapping_update_lock);
-	return irq;
-}
-
-
-static int bind_virq_to_irq(unsigned int virq, unsigned int cpu)
-{
-	struct evtchn_bind_virq bind_virq;
-	int evtchn, irq;
-
-	spin_lock(&irq_mapping_update_lock);
-
-	irq = per_cpu(virq_to_irq, cpu)[virq];
-
-	if (irq == -1) {
-		bind_virq.virq = virq;
-		bind_virq.vcpu = cpu;
-		if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq,
-						&bind_virq) != 0)
-			BUG();
-		evtchn = bind_virq.port;
-
-		irq = find_unbound_irq();
-
-		dynamic_irq_init(irq);
-		set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
-					      handle_level_irq, "virq");
-
-		evtchn_to_irq[evtchn] = irq;
-		irq_info[irq] = mk_irq_info(IRQT_VIRQ, virq, evtchn);
-
-		per_cpu(virq_to_irq, cpu)[virq] = irq;
-
-		bind_evtchn_to_cpu(evtchn, cpu);
-	}
-
-	irq_bindcount[irq]++;
-
-	spin_unlock(&irq_mapping_update_lock);
-
-	return irq;
-}
-
-static void unbind_from_irq(unsigned int irq)
-{
-	struct evtchn_close close;
-	int evtchn = evtchn_from_irq(irq);
-
-	spin_lock(&irq_mapping_update_lock);
-
-	if (VALID_EVTCHN(evtchn) && (--irq_bindcount[irq] == 0)) {
-		close.port = evtchn;
-		if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0)
-			BUG();
-
-		switch (type_from_irq(irq)) {
-		case IRQT_VIRQ:
-			per_cpu(virq_to_irq, cpu_from_evtchn(evtchn))
-				[index_from_irq(irq)] = -1;
-			break;
-		default:
-			break;
-		}
-
-		/* Closed ports are implicitly re-bound to VCPU0. */
-		bind_evtchn_to_cpu(evtchn, 0);
-
-		evtchn_to_irq[evtchn] = -1;
-		irq_info[irq] = IRQ_UNBOUND;
-
-		dynamic_irq_init(irq);
-	}
-
-	spin_unlock(&irq_mapping_update_lock);
-}
-
-int bind_evtchn_to_irqhandler(unsigned int evtchn,
-			      irq_handler_t handler,
-			      unsigned long irqflags,
-			      const char *devname, void *dev_id)
-{
-	unsigned int irq;
-	int retval;
-
-	irq = bind_evtchn_to_irq(evtchn);
-	retval = request_irq(irq, handler, irqflags, devname, dev_id);
-	if (retval != 0) {
-		unbind_from_irq(irq);
-		return retval;
-	}
-
-	return irq;
-}
-EXPORT_SYMBOL_GPL(bind_evtchn_to_irqhandler);
-
-int bind_virq_to_irqhandler(unsigned int virq, unsigned int cpu,
-			    irq_handler_t handler,
-			    unsigned long irqflags, const char *devname, void *dev_id)
-{
-	unsigned int irq;
-	int retval;
-
-	irq = bind_virq_to_irq(virq, cpu);
-	retval = request_irq(irq, handler, irqflags, devname, dev_id);
-	if (retval != 0) {
-		unbind_from_irq(irq);
-		return retval;
-	}
-
-	return irq;
-}
-EXPORT_SYMBOL_GPL(bind_virq_to_irqhandler);
-
-int bind_ipi_to_irqhandler(enum ipi_vector ipi,
-			   unsigned int cpu,
-			   irq_handler_t handler,
-			   unsigned long irqflags,
-			   const char *devname,
-			   void *dev_id)
-{
-	int irq, retval;
-
-	irq = bind_ipi_to_irq(ipi, cpu);
-	if (irq < 0)
-		return irq;
-
-	retval = request_irq(irq, handler, irqflags, devname, dev_id);
-	if (retval != 0) {
-		unbind_from_irq(irq);
-		return retval;
-	}
-
-	return irq;
-}
-
-void unbind_from_irqhandler(unsigned int irq, void *dev_id)
-{
-	free_irq(irq, dev_id);
-	unbind_from_irq(irq);
-}
-EXPORT_SYMBOL_GPL(unbind_from_irqhandler);
-
-void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector)
-{
-	int irq = per_cpu(ipi_to_irq, cpu)[vector];
-	BUG_ON(irq < 0);
-	notify_remote_via_irq(irq);
-}
-
-
-/*
- * Search the CPUs pending events bitmasks.  For each one found, map
- * the event number to an irq, and feed it into do_IRQ() for
- * handling.
- *
- * Xen uses a two-level bitmap to speed searching.  The first level is
- * a bitset of words which contain pending event bits.  The second
- * level is a bitset of pending events themselves.
- */
-void xen_evtchn_do_upcall(struct pt_regs *regs)
-{
-	int cpu = get_cpu();
-	struct shared_info *s = HYPERVISOR_shared_info;
-	struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu);
-	unsigned long pending_words;
-
-	vcpu_info->evtchn_upcall_pending = 0;
-
-	/* NB. No need for a barrier here -- XCHG is a barrier on x86. */
-	pending_words = xchg(&vcpu_info->evtchn_pending_sel, 0);
-	while (pending_words != 0) {
-		unsigned long pending_bits;
-		int word_idx = __ffs(pending_words);
-		pending_words &= ~(1UL << word_idx);
-
-		while ((pending_bits = active_evtchns(cpu, s, word_idx)) != 0) {
-			int bit_idx = __ffs(pending_bits);
-			int port = (word_idx * BITS_PER_LONG) + bit_idx;
-			int irq = evtchn_to_irq[port];
-
-			if (irq != -1) {
-				regs->orig_ax = ~irq;
-				do_IRQ(regs);
-			}
-		}
-	}
-
-	put_cpu();
-}
-
-/* Rebind an evtchn so that it gets delivered to a specific cpu */
-static void rebind_irq_to_cpu(unsigned irq, unsigned tcpu)
-{
-	struct evtchn_bind_vcpu bind_vcpu;
-	int evtchn = evtchn_from_irq(irq);
-
-	if (!VALID_EVTCHN(evtchn))
-		return;
-
-	/* Send future instances of this interrupt to other vcpu. */
-	bind_vcpu.port = evtchn;
-	bind_vcpu.vcpu = tcpu;
-
-	/*
-	 * If this fails, it usually just indicates that we're dealing with a
-	 * virq or IPI channel, which don't actually need to be rebound. Ignore
-	 * it, but don't do the xenlinux-level rebind in that case.
-	 */
-	if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) >= 0)
-		bind_evtchn_to_cpu(evtchn, tcpu);
-}
-
-
-static void set_affinity_irq(unsigned irq, cpumask_t dest)
-{
-	unsigned tcpu = first_cpu(dest);
-	rebind_irq_to_cpu(irq, tcpu);
-}
-
-static void enable_dynirq(unsigned int irq)
-{
-	int evtchn = evtchn_from_irq(irq);
-
-	if (VALID_EVTCHN(evtchn))
-		unmask_evtchn(evtchn);
-}
-
-static void disable_dynirq(unsigned int irq)
-{
-	int evtchn = evtchn_from_irq(irq);
-
-	if (VALID_EVTCHN(evtchn))
-		mask_evtchn(evtchn);
-}
-
-static void ack_dynirq(unsigned int irq)
-{
-	int evtchn = evtchn_from_irq(irq);
-
-	move_native_irq(irq);
-
-	if (VALID_EVTCHN(evtchn))
-		clear_evtchn(evtchn);
-}
-
-static int retrigger_dynirq(unsigned int irq)
-{
-	int evtchn = evtchn_from_irq(irq);
-	int ret = 0;
-
-	if (VALID_EVTCHN(evtchn)) {
-		set_evtchn(evtchn);
-		ret = 1;
-	}
-
-	return ret;
-}
-
-static struct irq_chip xen_dynamic_chip __read_mostly = {
-	.name		= "xen-dyn",
-	.mask		= disable_dynirq,
-	.unmask		= enable_dynirq,
-	.ack		= ack_dynirq,
-	.set_affinity	= set_affinity_irq,
-	.retrigger	= retrigger_dynirq,
-};
-
-void __init xen_init_IRQ(void)
-{
-	int i;
-
-	init_evtchn_cpu_bindings();
-
-	/* No event channels are 'live' right now. */
-	for (i = 0; i < NR_EVENT_CHANNELS; i++)
-		mask_evtchn(i);
-
-	/* Dynamic IRQ space is currently unbound. Zero the refcnts. */
-	for (i = 0; i < NR_IRQS; i++)
-		irq_bindcount[i] = 0;
-
-	irq_ctx_init(smp_processor_id());
-}
diff -urN old/arch/x86/xen/Makefile linux/arch/x86/xen/Makefile
--- old/arch/x86/xen/Makefile	2008-03-10 13:22:27.000000000 +0800
+++ linux/arch/x86/xen/Makefile	2008-03-25 13:56:41.367764448 +0800
@@ -1,4 +1,4 @@
 obj-y		:= enlighten.o setup.o features.o multicalls.o mmu.o \
-			events.o time.o manage.o xen-asm.o
+			time.o manage.o xen-asm.o
 
 obj-$(CONFIG_SMP)	+= smp.o
diff -urN old/arch/x86/xen/xen-ops.h linux/arch/x86/xen/xen-ops.h
--- old/arch/x86/xen/xen-ops.h	2008-03-25 13:21:09.996527604 +0800
+++ linux/arch/x86/xen/xen-ops.h	2008-03-25 13:59:16.349809137 +0800
@@ -2,6 +2,7 @@
 #define XEN_OPS_H
 
 #include <linux/init.h>
+#include <xen/xen-ops.h>
 
 /* These are code, but not functions.  Defined in entry.S */
 extern const char xen_hypervisor_callback[];
@@ -9,7 +10,6 @@
 
 void xen_copy_trap_info(struct trap_info *traps);
 
-DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu);
 DECLARE_PER_CPU(unsigned long, xen_cr3);
 DECLARE_PER_CPU(unsigned long, xen_current_cr3);
 
diff -urN old/drivers/xen/events.c linux/drivers/xen/events.c
--- old/drivers/xen/events.c	1970-01-01 08:00:00.000000000 +0800
+++ linux/drivers/xen/events.c	2008-03-25 13:56:41.368764287 +0800
@@ -0,0 +1,591 @@
+/*
+ * Xen event channels
+ *
+ * Xen models interrupts with abstract event channels.  Because each
+ * domain gets 1024 event channels, but NR_IRQ is not that large, we
+ * must dynamically map irqs<->event channels.  The event channels
+ * interface with the rest of the kernel by defining a xen interrupt
+ * chip.  When an event is received, it is mapped to an irq and sent
+ * through the normal interrupt processing path.
+ *
+ * There are four kinds of events which can be mapped to an event
+ * channel:
+ *
+ * 1. Inter-domain notifications.  This includes all the virtual
+ *    device events, since they're driven by front-ends in another domain
+ *    (typically dom0).
+ * 2. VIRQs, typically used for timers.  These are per-cpu events.
+ * 3. IPIs.
+ * 4. Hardware interrupts. Not supported at present.
+ *
+ * Jeremy Fitzhardinge <jeremy@xensource.com>, XenSource Inc, 2007
+ */
+
+#include <linux/linkage.h>
+#include <linux/interrupt.h>
+#include <linux/irq.h>
+#include <linux/module.h>
+#include <linux/string.h>
+
+#include <asm/ptrace.h>
+#include <asm/irq.h>
+#include <asm/sync_bitops.h>
+#include <asm/xen/hypercall.h>
+#include <asm/xen/hypervisor.h>
+
+#include <xen/events.h>
+#include <xen/interface/xen.h>
+#include <xen/interface/event_channel.h>
+
+#include "xen-ops.h"
+
+/*
+ * This lock protects updates to the following mapping and reference-count
+ * arrays. The lock does not need to be acquired to read the mapping tables.
+ */
+static DEFINE_SPINLOCK(irq_mapping_update_lock);
+
+/* IRQ <-> VIRQ mapping. */
+static DEFINE_PER_CPU(int, virq_to_irq[NR_VIRQS]) = {[0 ... NR_VIRQS-1] = -1};
+
+/* IRQ <-> IPI mapping */
+static DEFINE_PER_CPU(int, ipi_to_irq[XEN_NR_IPIS]) = {[0 ... XEN_NR_IPIS-1] = -1};
+
+/* Packed IRQ information: binding type, sub-type index, and event channel. */
+struct packed_irq
+{
+	unsigned short evtchn;
+	unsigned char index;
+	unsigned char type;
+};
+
+static struct packed_irq irq_info[NR_IRQS];
+
+/* Binding types. */
+enum {
+	IRQT_UNBOUND,
+	IRQT_PIRQ,
+	IRQT_VIRQ,
+	IRQT_IPI,
+	IRQT_EVTCHN
+};
+
+/* Convenient shorthand for packed representation of an unbound IRQ. */
+#define IRQ_UNBOUND	mk_irq_info(IRQT_UNBOUND, 0, 0)
+
+static int evtchn_to_irq[NR_EVENT_CHANNELS] = {
+	[0 ... NR_EVENT_CHANNELS-1] = -1
+};
+static unsigned long cpu_evtchn_mask[NR_CPUS][NR_EVENT_CHANNELS/BITS_PER_LONG];
+static u8 cpu_evtchn[NR_EVENT_CHANNELS];
+
+/* Reference counts for bindings to IRQs. */
+static int irq_bindcount[NR_IRQS];
+
+/* Xen will never allocate port zero for any purpose. */
+#define VALID_EVTCHN(chn)	((chn) != 0)
+
+/*
+ * Force a proper event-channel callback from Xen after clearing the
+ * callback mask. We do this in a very simple manner, by making a call
+ * down into Xen. The pending flag will be checked by Xen on return.
+ */
+void force_evtchn_callback(void)
+{
+	(void)HYPERVISOR_xen_version(0, NULL);
+}
+EXPORT_SYMBOL_GPL(force_evtchn_callback);
+
+static struct irq_chip xen_dynamic_chip;
+
+/* Constructor for packed IRQ information. */
+static inline struct packed_irq mk_irq_info(u32 type, u32 index, u32 evtchn)
+{
+	return (struct packed_irq) { evtchn, index, type };
+}
+
+/*
+ * Accessors for packed IRQ information.
+ */
+static inline unsigned int evtchn_from_irq(int irq)
+{
+	return irq_info[irq].evtchn;
+}
+
+static inline unsigned int index_from_irq(int irq)
+{
+	return irq_info[irq].index;
+}
+
+static inline unsigned int type_from_irq(int irq)
+{
+	return irq_info[irq].type;
+}
+
+static inline unsigned long active_evtchns(unsigned int cpu,
+					   struct shared_info *sh,
+					   unsigned int idx)
+{
+	return (sh->evtchn_pending[idx] &
+		cpu_evtchn_mask[cpu][idx] &
+		~sh->evtchn_mask[idx]);
+}
+
+static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu)
+{
+	int irq = evtchn_to_irq[chn];
+
+	BUG_ON(irq == -1);
+#ifdef CONFIG_SMP
+	irq_desc[irq].affinity = cpumask_of_cpu(cpu);
+#endif
+
+	__clear_bit(chn, cpu_evtchn_mask[cpu_evtchn[chn]]);
+	__set_bit(chn, cpu_evtchn_mask[cpu]);
+
+	cpu_evtchn[chn] = cpu;
+}
+
+static void init_evtchn_cpu_bindings(void)
+{
+#ifdef CONFIG_SMP
+	int i;
+	/* By default all event channels notify CPU#0. */
+	for (i = 0; i < NR_IRQS; i++)
+		irq_desc[i].affinity = cpumask_of_cpu(0);
+#endif
+
+	memset(cpu_evtchn, 0, sizeof(cpu_evtchn));
+	memset(cpu_evtchn_mask[0], ~0, sizeof(cpu_evtchn_mask[0]));
+}
+
+static inline unsigned int cpu_from_evtchn(unsigned int evtchn)
+{
+	return cpu_evtchn[evtchn];
+}
+
+static inline void clear_evtchn(int port)
+{
+	struct shared_info *s = HYPERVISOR_shared_info;
+	sync_clear_bit(port, &s->evtchn_pending[0]);
+}
+
+static inline void set_evtchn(int port)
+{
+	struct shared_info *s = HYPERVISOR_shared_info;
+	sync_set_bit(port, &s->evtchn_pending[0]);
+}
+
+
+/**
+ * notify_remote_via_irq - send event to remote end of event channel via irq
+ * @irq: irq of event channel to send event to
+ *
+ * Unlike notify_remote_via_evtchn(), this is safe to use across
+ * save/restore. Notifications on a broken connection are silently
+ * dropped.
+ */
+void notify_remote_via_irq(int irq)
+{
+	int evtchn = evtchn_from_irq(irq);
+
+	if (VALID_EVTCHN(evtchn))
+		notify_remote_via_evtchn(evtchn);
+}
+EXPORT_SYMBOL_GPL(notify_remote_via_irq);
+
+static void mask_evtchn(int port)
+{
+	struct shared_info *s = HYPERVISOR_shared_info;
+	sync_set_bit(port, &s->evtchn_mask[0]);
+}
+
+static void unmask_evtchn(int port)
+{
+	struct shared_info *s = HYPERVISOR_shared_info;
+	unsigned int cpu = get_cpu();
+
+	BUG_ON(!irqs_disabled());
+
+	/* Slow path (hypercall) if this is a non-local port. */
+	if (unlikely(cpu != cpu_from_evtchn(port))) {
+		struct evtchn_unmask unmask = { .port = port };
+		(void)HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask);
+	} else {
+		struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu);
+
+		sync_clear_bit(port, &s->evtchn_mask[0]);
+
+		/*
+		 * The following is basically the equivalent of
+		 * 'hw_resend_irq'. Just like a real IO-APIC we 'lose
+		 * the interrupt edge' if the channel is masked.
+		 */
+		if (sync_test_bit(port, &s->evtchn_pending[0]) &&
+		    !sync_test_and_set_bit(port / BITS_PER_LONG,
+					   &vcpu_info->evtchn_pending_sel))
+			vcpu_info->evtchn_upcall_pending = 1;
+	}
+
+	put_cpu();
+}
+
+static int find_unbound_irq(void)
+{
+	int irq;
+
+	/* Only allocate from dynirq range */
+	for (irq = 0; irq < NR_IRQS; irq++)
+		if (irq_bindcount[irq] == 0)
+			break;
+
+	if (irq == NR_IRQS)
+		panic("No available IRQ to bind to: increase NR_IRQS!\n");
+
+	return irq;
+}
+
+int bind_evtchn_to_irq(unsigned int evtchn)
+{
+	int irq;
+
+	spin_lock(&irq_mapping_update_lock);
+
+	irq = evtchn_to_irq[evtchn];
+
+	if (irq == -1) {
+		irq = find_unbound_irq();
+
+		dynamic_irq_init(irq);
+		set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
+					      handle_level_irq, "event");
+
+		evtchn_to_irq[evtchn] = irq;
+		irq_info[irq] = mk_irq_info(IRQT_EVTCHN, 0, evtchn);
+	}
+
+	irq_bindcount[irq]++;
+
+	spin_unlock(&irq_mapping_update_lock);
+
+	return irq;
+}
+EXPORT_SYMBOL_GPL(bind_evtchn_to_irq);
+
+static int bind_ipi_to_irq(unsigned int ipi, unsigned int cpu)
+{
+	struct evtchn_bind_ipi bind_ipi;
+	int evtchn, irq;
+
+	spin_lock(&irq_mapping_update_lock);
+
+	irq = per_cpu(ipi_to_irq, cpu)[ipi];
+	if (irq == -1) {
+		irq = find_unbound_irq();
+		if (irq < 0)
+			goto out;
+
+		dynamic_irq_init(irq);
+		set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
+					      handle_level_irq, "ipi");
+
+		bind_ipi.vcpu = cpu;
+		if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_ipi,
+						&bind_ipi) != 0)
+			BUG();
+		evtchn = bind_ipi.port;
+
+		evtchn_to_irq[evtchn] = irq;
+		irq_info[irq] = mk_irq_info(IRQT_IPI, ipi, evtchn);
+
+		per_cpu(ipi_to_irq, cpu)[ipi] = irq;
+
+		bind_evtchn_to_cpu(evtchn, cpu);
+	}
+
+	irq_bindcount[irq]++;
+
+ out:
+	spin_unlock(&irq_mapping_update_lock);
+	return irq;
+}
+
+
+static int bind_virq_to_irq(unsigned int virq, unsigned int cpu)
+{
+	struct evtchn_bind_virq bind_virq;
+	int evtchn, irq;
+
+	spin_lock(&irq_mapping_update_lock);
+
+	irq = per_cpu(virq_to_irq, cpu)[virq];
+
+	if (irq == -1) {
+		bind_virq.virq = virq;
+		bind_virq.vcpu = cpu;
+		if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq,
+						&bind_virq) != 0)
+			BUG();
+		evtchn = bind_virq.port;
+
+		irq = find_unbound_irq();
+
+		dynamic_irq_init(irq);
+		set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
+					      handle_level_irq, "virq");
+
+		evtchn_to_irq[evtchn] = irq;
+		irq_info[irq] = mk_irq_info(IRQT_VIRQ, virq, evtchn);
+
+		per_cpu(virq_to_irq, cpu)[virq] = irq;
+
+		bind_evtchn_to_cpu(evtchn, cpu);
+	}
+
+	irq_bindcount[irq]++;
+
+	spin_unlock(&irq_mapping_update_lock);
+
+	return irq;
+}
+
+static void unbind_from_irq(unsigned int irq)
+{
+	struct evtchn_close close;
+	int evtchn = evtchn_from_irq(irq);
+
+	spin_lock(&irq_mapping_update_lock);
+
+	if (VALID_EVTCHN(evtchn) && (--irq_bindcount[irq] == 0)) {
+		close.port = evtchn;
+		if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0)
+			BUG();
+
+		switch (type_from_irq(irq)) {
+		case IRQT_VIRQ:
+			per_cpu(virq_to_irq, cpu_from_evtchn(evtchn))
+				[index_from_irq(irq)] = -1;
+			break;
+		default:
+			break;
+		}
+
+		/* Closed ports are implicitly re-bound to VCPU0. */
+		bind_evtchn_to_cpu(evtchn, 0);
+
+		evtchn_to_irq[evtchn] = -1;
+		irq_info[irq] = IRQ_UNBOUND;
+
+		dynamic_irq_init(irq);
+	}
+
+	spin_unlock(&irq_mapping_update_lock);
+}
+
+int bind_evtchn_to_irqhandler(unsigned int evtchn,
+			      irq_handler_t handler,
+			      unsigned long irqflags,
+			      const char *devname, void *dev_id)
+{
+	unsigned int irq;
+	int retval;
+
+	irq = bind_evtchn_to_irq(evtchn);
+	retval = request_irq(irq, handler, irqflags, devname, dev_id);
+	if (retval != 0) {
+		unbind_from_irq(irq);
+		return retval;
+	}
+
+	return irq;
+}
+EXPORT_SYMBOL_GPL(bind_evtchn_to_irqhandler);
+
+int bind_virq_to_irqhandler(unsigned int virq, unsigned int cpu,
+			    irq_handler_t handler,
+			    unsigned long irqflags, const char *devname, void *dev_id)
+{
+	unsigned int irq;
+	int retval;
+
+	irq = bind_virq_to_irq(virq, cpu);
+	retval = request_irq(irq, handler, irqflags, devname, dev_id);
+	if (retval != 0) {
+		unbind_from_irq(irq);
+		return retval;
+	}
+
+	return irq;
+}
+EXPORT_SYMBOL_GPL(bind_virq_to_irqhandler);
+
+int bind_ipi_to_irqhandler(enum ipi_vector ipi,
+			   unsigned int cpu,
+			   irq_handler_t handler,
+			   unsigned long irqflags,
+			   const char *devname,
+			   void *dev_id)
+{
+	int irq, retval;
+
+	irq = bind_ipi_to_irq(ipi, cpu);
+	if (irq < 0)
+		return irq;
+
+	retval = request_irq(irq, handler, irqflags, devname, dev_id);
+	if (retval != 0) {
+		unbind_from_irq(irq);
+		return retval;
+	}
+
+	return irq;
+}
+
+void unbind_from_irqhandler(unsigned int irq, void *dev_id)
+{
+	free_irq(irq, dev_id);
+	unbind_from_irq(irq);
+}
+EXPORT_SYMBOL_GPL(unbind_from_irqhandler);
+
+void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector)
+{
+	int irq = per_cpu(ipi_to_irq, cpu)[vector];
+	BUG_ON(irq < 0);
+	notify_remote_via_irq(irq);
+}
+
+
+/*
+ * Search the CPU's pending event bitmasks.  For each one found, map
+ * the event number to an irq, and feed it into do_IRQ() for
+ * handling.
+ *
+ * Xen uses a two-level bitmap to speed searching.  The first level is
+ * a bitset of words which contain pending event bits.  The second
+ * level is a bitset of pending events themselves.
+ */
+void xen_evtchn_do_upcall(struct pt_regs *regs)
+{
+	int cpu = get_cpu();
+	struct shared_info *s = HYPERVISOR_shared_info;
+	struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu);
+	unsigned long pending_words;
+
+	vcpu_info->evtchn_upcall_pending = 0;
+
+	/* NB. No need for a barrier here -- XCHG is a barrier on x86. */
+	pending_words = xchg(&vcpu_info->evtchn_pending_sel, 0);
+	while (pending_words != 0) {
+		unsigned long pending_bits;
+		int word_idx = __ffs(pending_words);
+		pending_words &= ~(1UL << word_idx);
+
+		while ((pending_bits = active_evtchns(cpu, s, word_idx)) != 0) {
+			int bit_idx = __ffs(pending_bits);
+			int port = (word_idx * BITS_PER_LONG) + bit_idx;
+			int irq = evtchn_to_irq[port];
+
+			if (irq != -1) {
+				regs->orig_ax = ~irq;
+				do_IRQ(regs);
+			}
+		}
+	}
+
+	put_cpu();
+}
+
+/* Rebind an evtchn so that it gets delivered to a specific cpu */
+static void rebind_irq_to_cpu(unsigned irq, unsigned tcpu)
+{
+	struct evtchn_bind_vcpu bind_vcpu;
+	int evtchn = evtchn_from_irq(irq);
+
+	if (!VALID_EVTCHN(evtchn))
+		return;
+
+	/* Send future instances of this interrupt to other vcpu. */
+	bind_vcpu.port = evtchn;
+	bind_vcpu.vcpu = tcpu;
+
+	/*
+	 * If this fails, it usually just indicates that we're dealing with a
+	 * virq or IPI channel, which don't actually need to be rebound. Ignore
+	 * it, but don't do the xenlinux-level rebind in that case.
+	 */
+	if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) >= 0)
+		bind_evtchn_to_cpu(evtchn, tcpu);
+}
+
+
+static void set_affinity_irq(unsigned irq, cpumask_t dest)
+{
+	unsigned tcpu = first_cpu(dest);
+	rebind_irq_to_cpu(irq, tcpu);
+}
+
+static void enable_dynirq(unsigned int irq)
+{
+	int evtchn = evtchn_from_irq(irq);
+
+	if (VALID_EVTCHN(evtchn))
+		unmask_evtchn(evtchn);
+}
+
+static void disable_dynirq(unsigned int irq)
+{
+	int evtchn = evtchn_from_irq(irq);
+
+	if (VALID_EVTCHN(evtchn))
+		mask_evtchn(evtchn);
+}
+
+static void ack_dynirq(unsigned int irq)
+{
+	int evtchn = evtchn_from_irq(irq);
+
+	move_native_irq(irq);
+
+	if (VALID_EVTCHN(evtchn))
+		clear_evtchn(evtchn);
+}
+
+static int retrigger_dynirq(unsigned int irq)
+{
+	int evtchn = evtchn_from_irq(irq);
+	int ret = 0;
+
+	if (VALID_EVTCHN(evtchn)) {
+		set_evtchn(evtchn);
+		ret = 1;
+	}
+
+	return ret;
+}
+
+static struct irq_chip xen_dynamic_chip __read_mostly = {
+	.name		= "xen-dyn",
+	.mask		= disable_dynirq,
+	.unmask		= enable_dynirq,
+	.ack		= ack_dynirq,
+	.set_affinity	= set_affinity_irq,
+	.retrigger	= retrigger_dynirq,
+};
+
+void __init xen_init_IRQ(void)
+{
+	int i;
+
+	init_evtchn_cpu_bindings();
+
+	/* No event channels are 'live' right now. */
+	for (i = 0; i < NR_EVENT_CHANNELS; i++)
+		mask_evtchn(i);
+
+	/* Dynamic IRQ space is currently unbound. Zero the refcnts. */
+	for (i = 0; i < NR_IRQS; i++)
+		irq_bindcount[i] = 0;
+
+	irq_ctx_init(smp_processor_id());
+}
diff -urN old/drivers/xen/Makefile linux/drivers/xen/Makefile
--- old/drivers/xen/Makefile	2008-03-10 13:22:27.000000000 +0800
+++ linux/drivers/xen/Makefile	2008-03-25 13:56:41.368764287 +0800
@@ -1,2 +1,2 @@
-obj-y	+= grant-table.o
+obj-y	+= grant-table.o events.o
 obj-y	+= xenbus/
diff -urN old/include/xen/xen-ops.h linux/include/xen/xen-ops.h
--- old/include/xen/xen-ops.h	1970-01-01 08:00:00.000000000 +0800
+++ linux/include/xen/xen-ops.h	2008-03-25 14:00:09.041321546 +0800
@@ -0,0 +1,6 @@
+#ifndef INCLUDE_XEN_OPS_H
+#define INCLUDE_XEN_OPS_H
+
+DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu);
+
+#endif /* INCLUDE_XEN_OPS_H */


* RE: Xen common code across architecture
  2008-03-25  6:13       ` Dong, Eddie
@ 2008-03-25  6:35         ` Dong, Eddie
  2008-03-25 15:08         ` Jeremy Fitzhardinge
  1 sibling, 0 replies; 7+ messages in thread
From: Dong, Eddie @ 2008-03-25  6:35 UTC (permalink / raw)
  To: Dong, Eddie, Jeremy Fitzhardinge
  Cc: virtualization, linux-kernel, Andrew Morton, linux-ia64,
	kvm-ia64-devel, xen-ia64-devel

[-- Attachment #1: Type: text/plain, Size: 980 bytes --]

Dong, Eddie wrote:
> Jeremy/Andrew:
> 
> 	Isaku Yamahata, I, and some other IA64/Xen community members are
> working together to enable pv_ops for IA64 Linux. This patch is a
> preparation step that moves the common arch/x86/xen/events.c to
> drivers/xen (contents are identical) against the mm tree; it is based
> on Yamahata's IA64/pv_ops patch series.
> 	For a brief view of the whole pv_ops/IA64 patch series, please
> refer to the IA64 Linux mailing list.
> 
> Thanks, Eddie
> 
> 
     Fixed a typo. The merged patch is attached as well.


    Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>

--- drivers/xen/events_old.c	2008-03-25 14:31:40.503525471 +0800
+++ drivers/xen/events.c	2008-03-25 14:19:39.841851430 +0800
@@ -37,7 +37,7 @@
 #include <xen/interface/xen.h>
 #include <xen/interface/event_channel.h>
 
-#include "xen-ops.h"
+#include <xen/xen-ops.h>
 
 /*
  * This lock protects updates to the following mapping and reference-count

[-- Attachment #3: move_xenirq3.patch --]
[-- Type: application/octet-stream, Size: 30864 bytes --]

    Move events.c to drivers/xen for IA64/Xen support.


    Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>

diff -urN old/arch/x86/xen/events.c linux/arch/x86/xen/events.c
--- old/arch/x86/xen/events.c	2008-03-10 13:22:27.000000000 +0800
+++ linux/arch/x86/xen/events.c	1970-01-01 08:00:00.000000000 +0800
@@ -1,591 +0,0 @@
-/*
- * Xen event channels
- *
- * Xen models interrupts with abstract event channels.  Because each
- * domain gets 1024 event channels, but NR_IRQ is not that large, we
- * must dynamically map irqs<->event channels.  The event channels
- * interface with the rest of the kernel by defining a xen interrupt
- * chip.  When an event is recieved, it is mapped to an irq and sent
- * through the normal interrupt processing path.
- *
- * There are four kinds of events which can be mapped to an event
- * channel:
- *
- * 1. Inter-domain notifications.  This includes all the virtual
- *    device events, since they're driven by front-ends in another domain
- *    (typically dom0).
- * 2. VIRQs, typically used for timers.  These are per-cpu events.
- * 3. IPIs.
- * 4. Hardware interrupts. Not supported at present.
- *
- * Jeremy Fitzhardinge <jeremy@xensource.com>, XenSource Inc, 2007
- */
-
-#include <linux/linkage.h>
-#include <linux/interrupt.h>
-#include <linux/irq.h>
-#include <linux/module.h>
-#include <linux/string.h>
-
-#include <asm/ptrace.h>
-#include <asm/irq.h>
-#include <asm/sync_bitops.h>
-#include <asm/xen/hypercall.h>
-#include <asm/xen/hypervisor.h>
-
-#include <xen/events.h>
-#include <xen/interface/xen.h>
-#include <xen/interface/event_channel.h>
-
-#include "xen-ops.h"
-
-/*
- * This lock protects updates to the following mapping and reference-count
- * arrays. The lock does not need to be acquired to read the mapping tables.
- */
-static DEFINE_SPINLOCK(irq_mapping_update_lock);
-
-/* IRQ <-> VIRQ mapping. */
-static DEFINE_PER_CPU(int, virq_to_irq[NR_VIRQS]) = {[0 ... NR_VIRQS-1] = -1};
-
-/* IRQ <-> IPI mapping */
-static DEFINE_PER_CPU(int, ipi_to_irq[XEN_NR_IPIS]) = {[0 ... XEN_NR_IPIS-1] = -1};
-
-/* Packed IRQ information: binding type, sub-type index, and event channel. */
-struct packed_irq
-{
-	unsigned short evtchn;
-	unsigned char index;
-	unsigned char type;
-};
-
-static struct packed_irq irq_info[NR_IRQS];
-
-/* Binding types. */
-enum {
-	IRQT_UNBOUND,
-	IRQT_PIRQ,
-	IRQT_VIRQ,
-	IRQT_IPI,
-	IRQT_EVTCHN
-};
-
-/* Convenient shorthand for packed representation of an unbound IRQ. */
-#define IRQ_UNBOUND	mk_irq_info(IRQT_UNBOUND, 0, 0)
-
-static int evtchn_to_irq[NR_EVENT_CHANNELS] = {
-	[0 ... NR_EVENT_CHANNELS-1] = -1
-};
-static unsigned long cpu_evtchn_mask[NR_CPUS][NR_EVENT_CHANNELS/BITS_PER_LONG];
-static u8 cpu_evtchn[NR_EVENT_CHANNELS];
-
-/* Reference counts for bindings to IRQs. */
-static int irq_bindcount[NR_IRQS];
-
-/* Xen will never allocate port zero for any purpose. */
-#define VALID_EVTCHN(chn)	((chn) != 0)
-
-/*
- * Force a proper event-channel callback from Xen after clearing the
- * callback mask. We do this in a very simple manner, by making a call
- * down into Xen. The pending flag will be checked by Xen on return.
- */
-void force_evtchn_callback(void)
-{
-	(void)HYPERVISOR_xen_version(0, NULL);
-}
-EXPORT_SYMBOL_GPL(force_evtchn_callback);
-
-static struct irq_chip xen_dynamic_chip;
-
-/* Constructor for packed IRQ information. */
-static inline struct packed_irq mk_irq_info(u32 type, u32 index, u32 evtchn)
-{
-	return (struct packed_irq) { evtchn, index, type };
-}
-
-/*
- * Accessors for packed IRQ information.
- */
-static inline unsigned int evtchn_from_irq(int irq)
-{
-	return irq_info[irq].evtchn;
-}
-
-static inline unsigned int index_from_irq(int irq)
-{
-	return irq_info[irq].index;
-}
-
-static inline unsigned int type_from_irq(int irq)
-{
-	return irq_info[irq].type;
-}
-
-static inline unsigned long active_evtchns(unsigned int cpu,
-					   struct shared_info *sh,
-					   unsigned int idx)
-{
-	return (sh->evtchn_pending[idx] &
-		cpu_evtchn_mask[cpu][idx] &
-		~sh->evtchn_mask[idx]);
-}
-
-static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu)
-{
-	int irq = evtchn_to_irq[chn];
-
-	BUG_ON(irq == -1);
-#ifdef CONFIG_SMP
-	irq_desc[irq].affinity = cpumask_of_cpu(cpu);
-#endif
-
-	__clear_bit(chn, cpu_evtchn_mask[cpu_evtchn[chn]]);
-	__set_bit(chn, cpu_evtchn_mask[cpu]);
-
-	cpu_evtchn[chn] = cpu;
-}
-
-static void init_evtchn_cpu_bindings(void)
-{
-#ifdef CONFIG_SMP
-	int i;
-	/* By default all event channels notify CPU#0. */
-	for (i = 0; i < NR_IRQS; i++)
-		irq_desc[i].affinity = cpumask_of_cpu(0);
-#endif
-
-	memset(cpu_evtchn, 0, sizeof(cpu_evtchn));
-	memset(cpu_evtchn_mask[0], ~0, sizeof(cpu_evtchn_mask[0]));
-}
-
-static inline unsigned int cpu_from_evtchn(unsigned int evtchn)
-{
-	return cpu_evtchn[evtchn];
-}
-
-static inline void clear_evtchn(int port)
-{
-	struct shared_info *s = HYPERVISOR_shared_info;
-	sync_clear_bit(port, &s->evtchn_pending[0]);
-}
-
-static inline void set_evtchn(int port)
-{
-	struct shared_info *s = HYPERVISOR_shared_info;
-	sync_set_bit(port, &s->evtchn_pending[0]);
-}
-
-
-/**
- * notify_remote_via_irq - send event to remote end of event channel via irq
- * @irq: irq of event channel to send event to
- *
- * Unlike notify_remote_via_evtchn(), this is safe to use across
- * save/restore. Notifications on a broken connection are silently
- * dropped.
- */
-void notify_remote_via_irq(int irq)
-{
-	int evtchn = evtchn_from_irq(irq);
-
-	if (VALID_EVTCHN(evtchn))
-		notify_remote_via_evtchn(evtchn);
-}
-EXPORT_SYMBOL_GPL(notify_remote_via_irq);
-
-static void mask_evtchn(int port)
-{
-	struct shared_info *s = HYPERVISOR_shared_info;
-	sync_set_bit(port, &s->evtchn_mask[0]);
-}
-
-static void unmask_evtchn(int port)
-{
-	struct shared_info *s = HYPERVISOR_shared_info;
-	unsigned int cpu = get_cpu();
-
-	BUG_ON(!irqs_disabled());
-
-	/* Slow path (hypercall) if this is a non-local port. */
-	if (unlikely(cpu != cpu_from_evtchn(port))) {
-		struct evtchn_unmask unmask = { .port = port };
-		(void)HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask);
-	} else {
-		struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu);
-
-		sync_clear_bit(port, &s->evtchn_mask[0]);
-
-		/*
-		 * The following is basically the equivalent of
-		 * 'hw_resend_irq'. Just like a real IO-APIC we 'lose
-		 * the interrupt edge' if the channel is masked.
-		 */
-		if (sync_test_bit(port, &s->evtchn_pending[0]) &&
-		    !sync_test_and_set_bit(port / BITS_PER_LONG,
-					   &vcpu_info->evtchn_pending_sel))
-			vcpu_info->evtchn_upcall_pending = 1;
-	}
-
-	put_cpu();
-}
-
-static int find_unbound_irq(void)
-{
-	int irq;
-
-	/* Only allocate from dynirq range */
-	for (irq = 0; irq < NR_IRQS; irq++)
-		if (irq_bindcount[irq] == 0)
-			break;
-
-	if (irq == NR_IRQS)
-		panic("No available IRQ to bind to: increase NR_IRQS!\n");
-
-	return irq;
-}
-
-int bind_evtchn_to_irq(unsigned int evtchn)
-{
-	int irq;
-
-	spin_lock(&irq_mapping_update_lock);
-
-	irq = evtchn_to_irq[evtchn];
-
-	if (irq == -1) {
-		irq = find_unbound_irq();
-
-		dynamic_irq_init(irq);
-		set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
-					      handle_level_irq, "event");
-
-		evtchn_to_irq[evtchn] = irq;
-		irq_info[irq] = mk_irq_info(IRQT_EVTCHN, 0, evtchn);
-	}
-
-	irq_bindcount[irq]++;
-
-	spin_unlock(&irq_mapping_update_lock);
-
-	return irq;
-}
-EXPORT_SYMBOL_GPL(bind_evtchn_to_irq);
-
-static int bind_ipi_to_irq(unsigned int ipi, unsigned int cpu)
-{
-	struct evtchn_bind_ipi bind_ipi;
-	int evtchn, irq;
-
-	spin_lock(&irq_mapping_update_lock);
-
-	irq = per_cpu(ipi_to_irq, cpu)[ipi];
-	if (irq == -1) {
-		irq = find_unbound_irq();
-		if (irq < 0)
-			goto out;
-
-		dynamic_irq_init(irq);
-		set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
-					      handle_level_irq, "ipi");
-
-		bind_ipi.vcpu = cpu;
-		if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_ipi,
-						&bind_ipi) != 0)
-			BUG();
-		evtchn = bind_ipi.port;
-
-		evtchn_to_irq[evtchn] = irq;
-		irq_info[irq] = mk_irq_info(IRQT_IPI, ipi, evtchn);
-
-		per_cpu(ipi_to_irq, cpu)[ipi] = irq;
-
-		bind_evtchn_to_cpu(evtchn, cpu);
-	}
-
-	irq_bindcount[irq]++;
-
- out:
-	spin_unlock(&irq_mapping_update_lock);
-	return irq;
-}
-
-
-static int bind_virq_to_irq(unsigned int virq, unsigned int cpu)
-{
-	struct evtchn_bind_virq bind_virq;
-	int evtchn, irq;
-
-	spin_lock(&irq_mapping_update_lock);
-
-	irq = per_cpu(virq_to_irq, cpu)[virq];
-
-	if (irq == -1) {
-		bind_virq.virq = virq;
-		bind_virq.vcpu = cpu;
-		if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq,
-						&bind_virq) != 0)
-			BUG();
-		evtchn = bind_virq.port;
-
-		irq = find_unbound_irq();
-
-		dynamic_irq_init(irq);
-		set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
-					      handle_level_irq, "virq");
-
-		evtchn_to_irq[evtchn] = irq;
-		irq_info[irq] = mk_irq_info(IRQT_VIRQ, virq, evtchn);
-
-		per_cpu(virq_to_irq, cpu)[virq] = irq;
-
-		bind_evtchn_to_cpu(evtchn, cpu);
-	}
-
-	irq_bindcount[irq]++;
-
-	spin_unlock(&irq_mapping_update_lock);
-
-	return irq;
-}
-
-static void unbind_from_irq(unsigned int irq)
-{
-	struct evtchn_close close;
-	int evtchn = evtchn_from_irq(irq);
-
-	spin_lock(&irq_mapping_update_lock);
-
-	if (VALID_EVTCHN(evtchn) && (--irq_bindcount[irq] == 0)) {
-		close.port = evtchn;
-		if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0)
-			BUG();
-
-		switch (type_from_irq(irq)) {
-		case IRQT_VIRQ:
-			per_cpu(virq_to_irq, cpu_from_evtchn(evtchn))
-				[index_from_irq(irq)] = -1;
-			break;
-		default:
-			break;
-		}
-
-		/* Closed ports are implicitly re-bound to VCPU0. */
-		bind_evtchn_to_cpu(evtchn, 0);
-
-		evtchn_to_irq[evtchn] = -1;
-		irq_info[irq] = IRQ_UNBOUND;
-
-		dynamic_irq_init(irq);
-	}
-
-	spin_unlock(&irq_mapping_update_lock);
-}
-
-int bind_evtchn_to_irqhandler(unsigned int evtchn,
-			      irq_handler_t handler,
-			      unsigned long irqflags,
-			      const char *devname, void *dev_id)
-{
-	unsigned int irq;
-	int retval;
-
-	irq = bind_evtchn_to_irq(evtchn);
-	retval = request_irq(irq, handler, irqflags, devname, dev_id);
-	if (retval != 0) {
-		unbind_from_irq(irq);
-		return retval;
-	}
-
-	return irq;
-}
-EXPORT_SYMBOL_GPL(bind_evtchn_to_irqhandler);
-
-int bind_virq_to_irqhandler(unsigned int virq, unsigned int cpu,
-			    irq_handler_t handler,
-			    unsigned long irqflags, const char *devname, void *dev_id)
-{
-	unsigned int irq;
-	int retval;
-
-	irq = bind_virq_to_irq(virq, cpu);
-	retval = request_irq(irq, handler, irqflags, devname, dev_id);
-	if (retval != 0) {
-		unbind_from_irq(irq);
-		return retval;
-	}
-
-	return irq;
-}
-EXPORT_SYMBOL_GPL(bind_virq_to_irqhandler);
-
-int bind_ipi_to_irqhandler(enum ipi_vector ipi,
-			   unsigned int cpu,
-			   irq_handler_t handler,
-			   unsigned long irqflags,
-			   const char *devname,
-			   void *dev_id)
-{
-	int irq, retval;
-
-	irq = bind_ipi_to_irq(ipi, cpu);
-	if (irq < 0)
-		return irq;
-
-	retval = request_irq(irq, handler, irqflags, devname, dev_id);
-	if (retval != 0) {
-		unbind_from_irq(irq);
-		return retval;
-	}
-
-	return irq;
-}
-
-void unbind_from_irqhandler(unsigned int irq, void *dev_id)
-{
-	free_irq(irq, dev_id);
-	unbind_from_irq(irq);
-}
-EXPORT_SYMBOL_GPL(unbind_from_irqhandler);
-
-void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector)
-{
-	int irq = per_cpu(ipi_to_irq, cpu)[vector];
-	BUG_ON(irq < 0);
-	notify_remote_via_irq(irq);
-}
-
-
-/*
- * Search the CPUs pending events bitmasks.  For each one found, map
- * the event number to an irq, and feed it into do_IRQ() for
- * handling.
- *
- * Xen uses a two-level bitmap to speed searching.  The first level is
- * a bitset of words which contain pending event bits.  The second
- * level is a bitset of pending events themselves.
- */
-void xen_evtchn_do_upcall(struct pt_regs *regs)
-{
-	int cpu = get_cpu();
-	struct shared_info *s = HYPERVISOR_shared_info;
-	struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu);
-	unsigned long pending_words;
-
-	vcpu_info->evtchn_upcall_pending = 0;
-
-	/* NB. No need for a barrier here -- XCHG is a barrier on x86. */
-	pending_words = xchg(&vcpu_info->evtchn_pending_sel, 0);
-	while (pending_words != 0) {
-		unsigned long pending_bits;
-		int word_idx = __ffs(pending_words);
-		pending_words &= ~(1UL << word_idx);
-
-		while ((pending_bits = active_evtchns(cpu, s, word_idx)) != 0) {
-			int bit_idx = __ffs(pending_bits);
-			int port = (word_idx * BITS_PER_LONG) + bit_idx;
-			int irq = evtchn_to_irq[port];
-
-			if (irq != -1) {
-				regs->orig_ax = ~irq;
-				do_IRQ(regs);
-			}
-		}
-	}
-
-	put_cpu();
-}
-
-/* Rebind an evtchn so that it gets delivered to a specific cpu */
-static void rebind_irq_to_cpu(unsigned irq, unsigned tcpu)
-{
-	struct evtchn_bind_vcpu bind_vcpu;
-	int evtchn = evtchn_from_irq(irq);
-
-	if (!VALID_EVTCHN(evtchn))
-		return;
-
-	/* Send future instances of this interrupt to other vcpu. */
-	bind_vcpu.port = evtchn;
-	bind_vcpu.vcpu = tcpu;
-
-	/*
-	 * If this fails, it usually just indicates that we're dealing with a
-	 * virq or IPI channel, which don't actually need to be rebound. Ignore
-	 * it, but don't do the xenlinux-level rebind in that case.
-	 */
-	if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) >= 0)
-		bind_evtchn_to_cpu(evtchn, tcpu);
-}
-
-
-static void set_affinity_irq(unsigned irq, cpumask_t dest)
-{
-	unsigned tcpu = first_cpu(dest);
-	rebind_irq_to_cpu(irq, tcpu);
-}
-
-static void enable_dynirq(unsigned int irq)
-{
-	int evtchn = evtchn_from_irq(irq);
-
-	if (VALID_EVTCHN(evtchn))
-		unmask_evtchn(evtchn);
-}
-
-static void disable_dynirq(unsigned int irq)
-{
-	int evtchn = evtchn_from_irq(irq);
-
-	if (VALID_EVTCHN(evtchn))
-		mask_evtchn(evtchn);
-}
-
-static void ack_dynirq(unsigned int irq)
-{
-	int evtchn = evtchn_from_irq(irq);
-
-	move_native_irq(irq);
-
-	if (VALID_EVTCHN(evtchn))
-		clear_evtchn(evtchn);
-}
-
-static int retrigger_dynirq(unsigned int irq)
-{
-	int evtchn = evtchn_from_irq(irq);
-	int ret = 0;
-
-	if (VALID_EVTCHN(evtchn)) {
-		set_evtchn(evtchn);
-		ret = 1;
-	}
-
-	return ret;
-}
-
-static struct irq_chip xen_dynamic_chip __read_mostly = {
-	.name		= "xen-dyn",
-	.mask		= disable_dynirq,
-	.unmask		= enable_dynirq,
-	.ack		= ack_dynirq,
-	.set_affinity	= set_affinity_irq,
-	.retrigger	= retrigger_dynirq,
-};
-
-void __init xen_init_IRQ(void)
-{
-	int i;
-
-	init_evtchn_cpu_bindings();
-
-	/* No event channels are 'live' right now. */
-	for (i = 0; i < NR_EVENT_CHANNELS; i++)
-		mask_evtchn(i);
-
-	/* Dynamic IRQ space is currently unbound. Zero the refcnts. */
-	for (i = 0; i < NR_IRQS; i++)
-		irq_bindcount[i] = 0;
-
-	irq_ctx_init(smp_processor_id());
-}
diff -urN old/arch/x86/xen/Makefile linux/arch/x86/xen/Makefile
--- old/arch/x86/xen/Makefile	2008-03-10 13:22:27.000000000 +0800
+++ linux/arch/x86/xen/Makefile	2008-03-25 13:56:41.367764448 +0800
@@ -1,4 +1,4 @@
 obj-y		:= enlighten.o setup.o features.o multicalls.o mmu.o \
-			events.o time.o manage.o xen-asm.o
+			time.o manage.o xen-asm.o
 
 obj-$(CONFIG_SMP)	+= smp.o
diff -urN old/arch/x86/xen/xen-ops.h linux/arch/x86/xen/xen-ops.h
--- old/arch/x86/xen/xen-ops.h	2008-03-25 13:21:09.996527604 +0800
+++ linux/arch/x86/xen/xen-ops.h	2008-03-25 13:59:16.349809137 +0800
@@ -2,6 +2,7 @@
 #define XEN_OPS_H
 
 #include <linux/init.h>
+#include <xen/xen-ops.h>
 
 /* These are code, but not functions.  Defined in entry.S */
 extern const char xen_hypervisor_callback[];
@@ -9,7 +10,6 @@
 
 void xen_copy_trap_info(struct trap_info *traps);
 
-DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu);
 DECLARE_PER_CPU(unsigned long, xen_cr3);
 DECLARE_PER_CPU(unsigned long, xen_current_cr3);
 
diff -urN old/drivers/xen/events.c linux/drivers/xen/events.c
--- old/drivers/xen/events.c	1970-01-01 08:00:00.000000000 +0800
+++ linux/drivers/xen/events.c	2008-03-25 13:56:41.368764287 +0800
@@ -0,0 +1,591 @@
+/*
+ * Xen event channels
+ *
+ * Xen models interrupts with abstract event channels.  Because each
+ * domain gets 1024 event channels, but NR_IRQ is not that large, we
+ * must dynamically map irqs<->event channels.  The event channels
+ * interface with the rest of the kernel by defining a xen interrupt
+ * chip.  When an event is received, it is mapped to an irq and sent
+ * through the normal interrupt processing path.
+ *
+ * There are four kinds of events which can be mapped to an event
+ * channel:
+ *
+ * 1. Inter-domain notifications.  This includes all the virtual
+ *    device events, since they're driven by front-ends in another domain
+ *    (typically dom0).
+ * 2. VIRQs, typically used for timers.  These are per-cpu events.
+ * 3. IPIs.
+ * 4. Hardware interrupts. Not supported at present.
+ *
+ * Jeremy Fitzhardinge <jeremy@xensource.com>, XenSource Inc, 2007
+ */
+
+#include <linux/linkage.h>
+#include <linux/interrupt.h>
+#include <linux/irq.h>
+#include <linux/module.h>
+#include <linux/string.h>
+
+#include <asm/ptrace.h>
+#include <asm/irq.h>
+#include <asm/sync_bitops.h>
+#include <asm/xen/hypercall.h>
+#include <asm/xen/hypervisor.h>
+
+#include <xen/events.h>
+#include <xen/interface/xen.h>
+#include <xen/interface/event_channel.h>
+
+#include <xen/xen-ops.h>
+
+/*
+ * This lock protects updates to the following mapping and reference-count
+ * arrays. The lock does not need to be acquired to read the mapping tables.
+ */
+static DEFINE_SPINLOCK(irq_mapping_update_lock);
+
+/* IRQ <-> VIRQ mapping. */
+static DEFINE_PER_CPU(int, virq_to_irq[NR_VIRQS]) = {[0 ... NR_VIRQS-1] = -1};
+
+/* IRQ <-> IPI mapping */
+static DEFINE_PER_CPU(int, ipi_to_irq[XEN_NR_IPIS]) = {[0 ... XEN_NR_IPIS-1] = -1};
+
+/* Packed IRQ information: binding type, sub-type index, and event channel. */
+struct packed_irq
+{
+	unsigned short evtchn;
+	unsigned char index;
+	unsigned char type;
+};
+
+static struct packed_irq irq_info[NR_IRQS];
+
+/* Binding types. */
+enum {
+	IRQT_UNBOUND,
+	IRQT_PIRQ,
+	IRQT_VIRQ,
+	IRQT_IPI,
+	IRQT_EVTCHN
+};
+
+/* Convenient shorthand for packed representation of an unbound IRQ. */
+#define IRQ_UNBOUND	mk_irq_info(IRQT_UNBOUND, 0, 0)
+
+static int evtchn_to_irq[NR_EVENT_CHANNELS] = {
+	[0 ... NR_EVENT_CHANNELS-1] = -1
+};
+static unsigned long cpu_evtchn_mask[NR_CPUS][NR_EVENT_CHANNELS/BITS_PER_LONG];
+static u8 cpu_evtchn[NR_EVENT_CHANNELS];
+
+/* Reference counts for bindings to IRQs. */
+static int irq_bindcount[NR_IRQS];
+
+/* Xen will never allocate port zero for any purpose. */
+#define VALID_EVTCHN(chn)	((chn) != 0)
+
+/*
+ * Force a proper event-channel callback from Xen after clearing the
+ * callback mask. We do this in a very simple manner, by making a call
+ * down into Xen. The pending flag will be checked by Xen on return.
+ */
+void force_evtchn_callback(void)
+{
+	(void)HYPERVISOR_xen_version(0, NULL);
+}
+EXPORT_SYMBOL_GPL(force_evtchn_callback);
+
+static struct irq_chip xen_dynamic_chip;
+
+/* Constructor for packed IRQ information. */
+static inline struct packed_irq mk_irq_info(u32 type, u32 index, u32 evtchn)
+{
+	return (struct packed_irq) { evtchn, index, type };
+}
+
+/*
+ * Accessors for packed IRQ information.
+ */
+static inline unsigned int evtchn_from_irq(int irq)
+{
+	return irq_info[irq].evtchn;
+}
+
+static inline unsigned int index_from_irq(int irq)
+{
+	return irq_info[irq].index;
+}
+
+static inline unsigned int type_from_irq(int irq)
+{
+	return irq_info[irq].type;
+}
+
+static inline unsigned long active_evtchns(unsigned int cpu,
+					   struct shared_info *sh,
+					   unsigned int idx)
+{
+	return (sh->evtchn_pending[idx] &
+		cpu_evtchn_mask[cpu][idx] &
+		~sh->evtchn_mask[idx]);
+}
+
+static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu)
+{
+	int irq = evtchn_to_irq[chn];
+
+	BUG_ON(irq == -1);
+#ifdef CONFIG_SMP
+	irq_desc[irq].affinity = cpumask_of_cpu(cpu);
+#endif
+
+	__clear_bit(chn, cpu_evtchn_mask[cpu_evtchn[chn]]);
+	__set_bit(chn, cpu_evtchn_mask[cpu]);
+
+	cpu_evtchn[chn] = cpu;
+}
+
+static void init_evtchn_cpu_bindings(void)
+{
+#ifdef CONFIG_SMP
+	int i;
+	/* By default all event channels notify CPU#0. */
+	for (i = 0; i < NR_IRQS; i++)
+		irq_desc[i].affinity = cpumask_of_cpu(0);
+#endif
+
+	memset(cpu_evtchn, 0, sizeof(cpu_evtchn));
+	memset(cpu_evtchn_mask[0], ~0, sizeof(cpu_evtchn_mask[0]));
+}
+
+static inline unsigned int cpu_from_evtchn(unsigned int evtchn)
+{
+	return cpu_evtchn[evtchn];
+}
+
+static inline void clear_evtchn(int port)
+{
+	struct shared_info *s = HYPERVISOR_shared_info;
+	sync_clear_bit(port, &s->evtchn_pending[0]);
+}
+
+static inline void set_evtchn(int port)
+{
+	struct shared_info *s = HYPERVISOR_shared_info;
+	sync_set_bit(port, &s->evtchn_pending[0]);
+}
+
+
+/**
+ * notify_remote_via_irq - send event to remote end of event channel via irq
+ * @irq: irq of event channel to send event to
+ *
+ * Unlike notify_remote_via_evtchn(), this is safe to use across
+ * save/restore. Notifications on a broken connection are silently
+ * dropped.
+ */
+void notify_remote_via_irq(int irq)
+{
+	int evtchn = evtchn_from_irq(irq);
+
+	if (VALID_EVTCHN(evtchn))
+		notify_remote_via_evtchn(evtchn);
+}
+EXPORT_SYMBOL_GPL(notify_remote_via_irq);
+
+static void mask_evtchn(int port)
+{
+	struct shared_info *s = HYPERVISOR_shared_info;
+	sync_set_bit(port, &s->evtchn_mask[0]);
+}
+
+static void unmask_evtchn(int port)
+{
+	struct shared_info *s = HYPERVISOR_shared_info;
+	unsigned int cpu = get_cpu();
+
+	BUG_ON(!irqs_disabled());
+
+	/* Slow path (hypercall) if this is a non-local port. */
+	if (unlikely(cpu != cpu_from_evtchn(port))) {
+		struct evtchn_unmask unmask = { .port = port };
+		(void)HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask);
+	} else {
+		struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu);
+
+		sync_clear_bit(port, &s->evtchn_mask[0]);
+
+		/*
+		 * The following is basically the equivalent of
+		 * 'hw_resend_irq'. Just like a real IO-APIC we 'lose
+		 * the interrupt edge' if the channel is masked.
+		 */
+		if (sync_test_bit(port, &s->evtchn_pending[0]) &&
+		    !sync_test_and_set_bit(port / BITS_PER_LONG,
+					   &vcpu_info->evtchn_pending_sel))
+			vcpu_info->evtchn_upcall_pending = 1;
+	}
+
+	put_cpu();
+}
+
+static int find_unbound_irq(void)
+{
+	int irq;
+
+	/* Only allocate from dynirq range */
+	for (irq = 0; irq < NR_IRQS; irq++)
+		if (irq_bindcount[irq] == 0)
+			break;
+
+	if (irq == NR_IRQS)
+		panic("No available IRQ to bind to: increase NR_IRQS!\n");
+
+	return irq;
+}
+
+int bind_evtchn_to_irq(unsigned int evtchn)
+{
+	int irq;
+
+	spin_lock(&irq_mapping_update_lock);
+
+	irq = evtchn_to_irq[evtchn];
+
+	if (irq == -1) {
+		irq = find_unbound_irq();
+
+		dynamic_irq_init(irq);
+		set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
+					      handle_level_irq, "event");
+
+		evtchn_to_irq[evtchn] = irq;
+		irq_info[irq] = mk_irq_info(IRQT_EVTCHN, 0, evtchn);
+	}
+
+	irq_bindcount[irq]++;
+
+	spin_unlock(&irq_mapping_update_lock);
+
+	return irq;
+}
+EXPORT_SYMBOL_GPL(bind_evtchn_to_irq);
+
+static int bind_ipi_to_irq(unsigned int ipi, unsigned int cpu)
+{
+	struct evtchn_bind_ipi bind_ipi;
+	int evtchn, irq;
+
+	spin_lock(&irq_mapping_update_lock);
+
+	irq = per_cpu(ipi_to_irq, cpu)[ipi];
+	if (irq == -1) {
+		irq = find_unbound_irq();
+		if (irq < 0)
+			goto out;
+
+		dynamic_irq_init(irq);
+		set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
+					      handle_level_irq, "ipi");
+
+		bind_ipi.vcpu = cpu;
+		if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_ipi,
+						&bind_ipi) != 0)
+			BUG();
+		evtchn = bind_ipi.port;
+
+		evtchn_to_irq[evtchn] = irq;
+		irq_info[irq] = mk_irq_info(IRQT_IPI, ipi, evtchn);
+
+		per_cpu(ipi_to_irq, cpu)[ipi] = irq;
+
+		bind_evtchn_to_cpu(evtchn, cpu);
+	}
+
+	irq_bindcount[irq]++;
+
+ out:
+	spin_unlock(&irq_mapping_update_lock);
+	return irq;
+}
+
+
+static int bind_virq_to_irq(unsigned int virq, unsigned int cpu)
+{
+	struct evtchn_bind_virq bind_virq;
+	int evtchn, irq;
+
+	spin_lock(&irq_mapping_update_lock);
+
+	irq = per_cpu(virq_to_irq, cpu)[virq];
+
+	if (irq == -1) {
+		bind_virq.virq = virq;
+		bind_virq.vcpu = cpu;
+		if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq,
+						&bind_virq) != 0)
+			BUG();
+		evtchn = bind_virq.port;
+
+		irq = find_unbound_irq();
+
+		dynamic_irq_init(irq);
+		set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
+					      handle_level_irq, "virq");
+
+		evtchn_to_irq[evtchn] = irq;
+		irq_info[irq] = mk_irq_info(IRQT_VIRQ, virq, evtchn);
+
+		per_cpu(virq_to_irq, cpu)[virq] = irq;
+
+		bind_evtchn_to_cpu(evtchn, cpu);
+	}
+
+	irq_bindcount[irq]++;
+
+	spin_unlock(&irq_mapping_update_lock);
+
+	return irq;
+}
+
+static void unbind_from_irq(unsigned int irq)
+{
+	struct evtchn_close close;
+	int evtchn = evtchn_from_irq(irq);
+
+	spin_lock(&irq_mapping_update_lock);
+
+	if (VALID_EVTCHN(evtchn) && (--irq_bindcount[irq] == 0)) {
+		close.port = evtchn;
+		if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0)
+			BUG();
+
+		switch (type_from_irq(irq)) {
+		case IRQT_VIRQ:
+			per_cpu(virq_to_irq, cpu_from_evtchn(evtchn))
+				[index_from_irq(irq)] = -1;
+			break;
+		default:
+			break;
+		}
+
+		/* Closed ports are implicitly re-bound to VCPU0. */
+		bind_evtchn_to_cpu(evtchn, 0);
+
+		evtchn_to_irq[evtchn] = -1;
+		irq_info[irq] = IRQ_UNBOUND;
+
+		dynamic_irq_init(irq);
+	}
+
+	spin_unlock(&irq_mapping_update_lock);
+}
+
+int bind_evtchn_to_irqhandler(unsigned int evtchn,
+			      irq_handler_t handler,
+			      unsigned long irqflags,
+			      const char *devname, void *dev_id)
+{
+	unsigned int irq;
+	int retval;
+
+	irq = bind_evtchn_to_irq(evtchn);
+	retval = request_irq(irq, handler, irqflags, devname, dev_id);
+	if (retval != 0) {
+		unbind_from_irq(irq);
+		return retval;
+	}
+
+	return irq;
+}
+EXPORT_SYMBOL_GPL(bind_evtchn_to_irqhandler);
+
+int bind_virq_to_irqhandler(unsigned int virq, unsigned int cpu,
+			    irq_handler_t handler,
+			    unsigned long irqflags, const char *devname, void *dev_id)
+{
+	unsigned int irq;
+	int retval;
+
+	irq = bind_virq_to_irq(virq, cpu);
+	retval = request_irq(irq, handler, irqflags, devname, dev_id);
+	if (retval != 0) {
+		unbind_from_irq(irq);
+		return retval;
+	}
+
+	return irq;
+}
+EXPORT_SYMBOL_GPL(bind_virq_to_irqhandler);
+
+int bind_ipi_to_irqhandler(enum ipi_vector ipi,
+			   unsigned int cpu,
+			   irq_handler_t handler,
+			   unsigned long irqflags,
+			   const char *devname,
+			   void *dev_id)
+{
+	int irq, retval;
+
+	irq = bind_ipi_to_irq(ipi, cpu);
+	if (irq < 0)
+		return irq;
+
+	retval = request_irq(irq, handler, irqflags, devname, dev_id);
+	if (retval != 0) {
+		unbind_from_irq(irq);
+		return retval;
+	}
+
+	return irq;
+}
+
+void unbind_from_irqhandler(unsigned int irq, void *dev_id)
+{
+	free_irq(irq, dev_id);
+	unbind_from_irq(irq);
+}
+EXPORT_SYMBOL_GPL(unbind_from_irqhandler);
+
+void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector)
+{
+	int irq = per_cpu(ipi_to_irq, cpu)[vector];
+	BUG_ON(irq < 0);
+	notify_remote_via_irq(irq);
+}
+
+
+/*
+ * Search the CPUs pending events bitmasks.  For each one found, map
+ * the event number to an irq, and feed it into do_IRQ() for
+ * handling.
+ *
+ * Xen uses a two-level bitmap to speed searching.  The first level is
+ * a bitset of words which contain pending event bits.  The second
+ * level is a bitset of pending events themselves.
+ */
+void xen_evtchn_do_upcall(struct pt_regs *regs)
+{
+	int cpu = get_cpu();
+	struct shared_info *s = HYPERVISOR_shared_info;
+	struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu);
+	unsigned long pending_words;
+
+	vcpu_info->evtchn_upcall_pending = 0;
+
+	/* NB. No need for a barrier here -- XCHG is a barrier on x86. */
+	pending_words = xchg(&vcpu_info->evtchn_pending_sel, 0);
+	while (pending_words != 0) {
+		unsigned long pending_bits;
+		int word_idx = __ffs(pending_words);
+		pending_words &= ~(1UL << word_idx);
+
+		while ((pending_bits = active_evtchns(cpu, s, word_idx)) != 0) {
+			int bit_idx = __ffs(pending_bits);
+			int port = (word_idx * BITS_PER_LONG) + bit_idx;
+			int irq = evtchn_to_irq[port];
+
+			if (irq != -1) {
+				regs->orig_ax = ~irq;
+				do_IRQ(regs);
+			}
+		}
+	}
+
+	put_cpu();
+}
+
+/* Rebind an evtchn so that it gets delivered to a specific cpu */
+static void rebind_irq_to_cpu(unsigned irq, unsigned tcpu)
+{
+	struct evtchn_bind_vcpu bind_vcpu;
+	int evtchn = evtchn_from_irq(irq);
+
+	if (!VALID_EVTCHN(evtchn))
+		return;
+
+	/* Send future instances of this interrupt to other vcpu. */
+	bind_vcpu.port = evtchn;
+	bind_vcpu.vcpu = tcpu;
+
+	/*
+	 * If this fails, it usually just indicates that we're dealing with a
+	 * virq or IPI channel, which don't actually need to be rebound. Ignore
+	 * it, but don't do the xenlinux-level rebind in that case.
+	 */
+	if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) >= 0)
+		bind_evtchn_to_cpu(evtchn, tcpu);
+}
+
+
+static void set_affinity_irq(unsigned irq, cpumask_t dest)
+{
+	unsigned tcpu = first_cpu(dest);
+	rebind_irq_to_cpu(irq, tcpu);
+}
+
+static void enable_dynirq(unsigned int irq)
+{
+	int evtchn = evtchn_from_irq(irq);
+
+	if (VALID_EVTCHN(evtchn))
+		unmask_evtchn(evtchn);
+}
+
+static void disable_dynirq(unsigned int irq)
+{
+	int evtchn = evtchn_from_irq(irq);
+
+	if (VALID_EVTCHN(evtchn))
+		mask_evtchn(evtchn);
+}
+
+static void ack_dynirq(unsigned int irq)
+{
+	int evtchn = evtchn_from_irq(irq);
+
+	move_native_irq(irq);
+
+	if (VALID_EVTCHN(evtchn))
+		clear_evtchn(evtchn);
+}
+
+static int retrigger_dynirq(unsigned int irq)
+{
+	int evtchn = evtchn_from_irq(irq);
+	int ret = 0;
+
+	if (VALID_EVTCHN(evtchn)) {
+		set_evtchn(evtchn);
+		ret = 1;
+	}
+
+	return ret;
+}
+
+static struct irq_chip xen_dynamic_chip __read_mostly = {
+	.name		= "xen-dyn",
+	.mask		= disable_dynirq,
+	.unmask		= enable_dynirq,
+	.ack		= ack_dynirq,
+	.set_affinity	= set_affinity_irq,
+	.retrigger	= retrigger_dynirq,
+};
+
+void __init xen_init_IRQ(void)
+{
+	int i;
+
+	init_evtchn_cpu_bindings();
+
+	/* No event channels are 'live' right now. */
+	for (i = 0; i < NR_EVENT_CHANNELS; i++)
+		mask_evtchn(i);
+
+	/* Dynamic IRQ space is currently unbound. Zero the refcnts. */
+	for (i = 0; i < NR_IRQS; i++)
+		irq_bindcount[i] = 0;
+
+	irq_ctx_init(smp_processor_id());
+}
diff -urN old/drivers/xen/Makefile linux/drivers/xen/Makefile
--- old/drivers/xen/Makefile	2008-03-10 13:22:27.000000000 +0800
+++ linux/drivers/xen/Makefile	2008-03-25 13:56:41.368764287 +0800
@@ -1,2 +1,2 @@
-obj-y	+= grant-table.o
+obj-y	+= grant-table.o events.o
 obj-y	+= xenbus/
diff -urN old/include/xen/xen-ops.h linux/include/xen/xen-ops.h
--- old/include/xen/xen-ops.h	1970-01-01 08:00:00.000000000 +0800
+++ linux/include/xen/xen-ops.h	2008-03-25 14:00:09.041321546 +0800
@@ -0,0 +1,6 @@
+#ifndef INCLUDE_XEN_OPS_H
+#define INCLUDE_XEN_OPS_H
+
+DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu);
+
+#endif /* INCLUDE_XEN_OPS_H */

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Xen common code across architecture
  2008-03-25  6:13       ` Dong, Eddie
  2008-03-25  6:35         ` Dong, Eddie
@ 2008-03-25 15:08         ` Jeremy Fitzhardinge
  1 sibling, 0 replies; 7+ messages in thread
From: Jeremy Fitzhardinge @ 2008-03-25 15:08 UTC (permalink / raw)
  To: Dong, Eddie
  Cc: virtualization, linux-kernel, Andrew Morton, linux-ia64,
	kvm-ia64-devel, xen-ia64-devel

Dong, Eddie wrote:
> Jeremy/Andrew:
>
> 	Isaku Yamahata, I, and some other IA64/Xen community members are
> working together to enable pv_ops for IA64 Linux. This patch is a
> preparation that moves the common arch/x86/xen/events.c to drivers/xen
> (contents are identical) against the mm tree; it is based on Yamahata's
> IA64/pv_ops patch series.
> 	For a brief overview of the whole pv_ops/IA64 patch series,
> please refer to the IA64 Linux mailing list.
>   

How do you want to manage this work?  I'm currently basing off 
Ingo+tglx's x86.git tree.  Would you like me to track these kinds of 
common-code changes in my tree, while you maintain a separate 
ia64-specific tree?

> Thanks, Eddie
> 	
>
>     Move events.c to drivers/xen for IA64/Xen
>     support.
>   

Looks reasonable.  One comment below.

>     Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
>
> diff -urN old/arch/x86/xen/events.c linux/arch/x86/xen/events.c
> --- old/arch/x86/xen/events.c	2008-03-10 13:22:27.000000000 +0800
> +++ linux/arch/x86/xen/events.c	1970-01-01 08:00:00.000000000 +0800
> @@ -1,591 +0,0 @@
> -/*
> - * Xen event channels
> - *
> - * Xen models interrupts with abstract event channels.  Because each
> - * domain gets 1024 event channels, but NR_IRQ is not that large, we
> - * must dynamically map irqs<->event channels.  The event channels
> - * interface with the rest of the kernel by defining a xen interrupt
> - * chip.  When an event is received, it is mapped to an irq and sent
> - * through the normal interrupt processing path.
> - *
> - * There are four kinds of events which can be mapped to an event
> - * channel:
> - *
> - * 1. Inter-domain notifications.  This includes all the virtual
> - *    device events, since they're driven by front-ends in another domain
> - *    (typically dom0).
> - * 2. VIRQs, typically used for timers.  These are per-cpu events.
> - * 3. IPIs.
> - * 4. Hardware interrupts. Not supported at present.
> - *
> - * Jeremy Fitzhardinge <jeremy@xensource.com>, XenSource Inc, 2007
> - */
> -
> -#include <linux/linkage.h>
> -#include <linux/interrupt.h>
> -#include <linux/irq.h>
> -#include <linux/module.h>
> -#include <linux/string.h>
> -
> -#include <asm/ptrace.h>
> -#include <asm/irq.h>
> -#include <asm/sync_bitops.h>
> -#include <asm/xen/hypercall.h>
> -#include <asm/xen/hypervisor.h>
> -
> -#include <xen/events.h>
> -#include <xen/interface/xen.h>
> -#include <xen/interface/event_channel.h>
> -
> -#include "xen-ops.h"
> -
> -/*
> - * This lock protects updates to the following mapping and reference-count
> - * arrays. The lock does not need to be acquired to read the mapping tables.
> - */
> -static DEFINE_SPINLOCK(irq_mapping_update_lock);
> -
> -/* IRQ <-> VIRQ mapping. */
> -static DEFINE_PER_CPU(int, virq_to_irq[NR_VIRQS]) = {[0 ... NR_VIRQS-1] = -1};
> -
> -/* IRQ <-> IPI mapping */
> -static DEFINE_PER_CPU(int, ipi_to_irq[XEN_NR_IPIS]) = {[0 ... XEN_NR_IPIS-1] = -1};
> -
> -/* Packed IRQ information: binding type, sub-type index, and event channel. */
> -struct packed_irq
> -{
> -	unsigned short evtchn;
> -	unsigned char index;
> -	unsigned char type;
> -};
> -
> -static struct packed_irq irq_info[NR_IRQS];
> -
> -/* Binding types. */
> -enum {
> -	IRQT_UNBOUND,
> -	IRQT_PIRQ,
> -	IRQT_VIRQ,
> -	IRQT_IPI,
> -	IRQT_EVTCHN
> -};
> -
> -/* Convenient shorthand for packed representation of an unbound IRQ. */
> -#define IRQ_UNBOUND	mk_irq_info(IRQT_UNBOUND, 0, 0)
> -
> -static int evtchn_to_irq[NR_EVENT_CHANNELS] = {
> -	[0 ... NR_EVENT_CHANNELS-1] = -1
> -};
> -static unsigned long cpu_evtchn_mask[NR_CPUS][NR_EVENT_CHANNELS/BITS_PER_LONG];
> -static u8 cpu_evtchn[NR_EVENT_CHANNELS];
> -
> -/* Reference counts for bindings to IRQs. */
> -static int irq_bindcount[NR_IRQS];
> -
> -/* Xen will never allocate port zero for any purpose. */
> -#define VALID_EVTCHN(chn)	((chn) != 0)
> -
> -/*
> - * Force a proper event-channel callback from Xen after clearing the
> - * callback mask. We do this in a very simple manner, by making a call
> - * down into Xen. The pending flag will be checked by Xen on return.
> - */
> -void force_evtchn_callback(void)
> -{
> -	(void)HYPERVISOR_xen_version(0, NULL);
> -}
> -EXPORT_SYMBOL_GPL(force_evtchn_callback);
> -
> -static struct irq_chip xen_dynamic_chip;
> -
> -/* Constructor for packed IRQ information. */
> -static inline struct packed_irq mk_irq_info(u32 type, u32 index, u32 evtchn)
> -{
> -	return (struct packed_irq) { evtchn, index, type };
> -}
> -
> -/*
> - * Accessors for packed IRQ information.
> - */
> -static inline unsigned int evtchn_from_irq(int irq)
> -{
> -	return irq_info[irq].evtchn;
> -}
> -
> -static inline unsigned int index_from_irq(int irq)
> -{
> -	return irq_info[irq].index;
> -}
> -
> -static inline unsigned int type_from_irq(int irq)
> -{
> -	return irq_info[irq].type;
> -}
> -
> -static inline unsigned long active_evtchns(unsigned int cpu,
> -					   struct shared_info *sh,
> -					   unsigned int idx)
> -{
> -	return (sh->evtchn_pending[idx] &
> -		cpu_evtchn_mask[cpu][idx] &
> -		~sh->evtchn_mask[idx]);
> -}
> -
> -static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu)
> -{
> -	int irq = evtchn_to_irq[chn];
> -
> -	BUG_ON(irq == -1);
> -#ifdef CONFIG_SMP
> -	irq_desc[irq].affinity = cpumask_of_cpu(cpu);
> -#endif
> -
> -	__clear_bit(chn, cpu_evtchn_mask[cpu_evtchn[chn]]);
> -	__set_bit(chn, cpu_evtchn_mask[cpu]);
> -
> -	cpu_evtchn[chn] = cpu;
> -}
> -
> -static void init_evtchn_cpu_bindings(void)
> -{
> -#ifdef CONFIG_SMP
> -	int i;
> -	/* By default all event channels notify CPU#0. */
> -	for (i = 0; i < NR_IRQS; i++)
> -		irq_desc[i].affinity = cpumask_of_cpu(0);
> -#endif
> -
> -	memset(cpu_evtchn, 0, sizeof(cpu_evtchn));
> -	memset(cpu_evtchn_mask[0], ~0, sizeof(cpu_evtchn_mask[0]));
> -}
> -
> -static inline unsigned int cpu_from_evtchn(unsigned int evtchn)
> -{
> -	return cpu_evtchn[evtchn];
> -}
> -
> -static inline void clear_evtchn(int port)
> -{
> -	struct shared_info *s = HYPERVISOR_shared_info;
> -	sync_clear_bit(port, &s->evtchn_pending[0]);
> -}
> -
> -static inline void set_evtchn(int port)
> -{
> -	struct shared_info *s = HYPERVISOR_shared_info;
> -	sync_set_bit(port, &s->evtchn_pending[0]);
> -}
> -
> -
> -/**
> - * notify_remote_via_irq - send event to remote end of event channel via irq
> - * @irq: irq of event channel to send event to
> - *
> - * Unlike notify_remote_via_evtchn(), this is safe to use across
> - * save/restore. Notifications on a broken connection are silently
> - * dropped.
> - */
> -void notify_remote_via_irq(int irq)
> -{
> -	int evtchn = evtchn_from_irq(irq);
> -
> -	if (VALID_EVTCHN(evtchn))
> -		notify_remote_via_evtchn(evtchn);
> -}
> -EXPORT_SYMBOL_GPL(notify_remote_via_irq);
> -
> -static void mask_evtchn(int port)
> -{
> -	struct shared_info *s = HYPERVISOR_shared_info;
> -	sync_set_bit(port, &s->evtchn_mask[0]);
> -}
> -
> -static void unmask_evtchn(int port)
> -{
> -	struct shared_info *s = HYPERVISOR_shared_info;
> -	unsigned int cpu = get_cpu();
> -
> -	BUG_ON(!irqs_disabled());
> -
> -	/* Slow path (hypercall) if this is a non-local port. */
> -	if (unlikely(cpu != cpu_from_evtchn(port))) {
> -		struct evtchn_unmask unmask = { .port = port };
> -		(void)HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask);
> -	} else {
> -		struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu);
> -
> -		sync_clear_bit(port, &s->evtchn_mask[0]);
> -
> -		/*
> -		 * The following is basically the equivalent of
> -		 * 'hw_resend_irq'. Just like a real IO-APIC we 'lose
> -		 * the interrupt edge' if the channel is masked.
> -		 */
> -		if (sync_test_bit(port, &s->evtchn_pending[0]) &&
> -		    !sync_test_and_set_bit(port / BITS_PER_LONG,
> -					   &vcpu_info->evtchn_pending_sel))
> -			vcpu_info->evtchn_upcall_pending = 1;
> -	}
> -
> -	put_cpu();
> -}
> -
> -static int find_unbound_irq(void)
> -{
> -	int irq;
> -
> -	/* Only allocate from dynirq range */
> -	for (irq = 0; irq < NR_IRQS; irq++)
> -		if (irq_bindcount[irq] == 0)
> -			break;
> -
> -	if (irq == NR_IRQS)
> -		panic("No available IRQ to bind to: increase NR_IRQS!\n");
> -
> -	return irq;
> -}
> -
> -int bind_evtchn_to_irq(unsigned int evtchn)
> -{
> -	int irq;
> -
> -	spin_lock(&irq_mapping_update_lock);
> -
> -	irq = evtchn_to_irq[evtchn];
> -
> -	if (irq == -1) {
> -		irq = find_unbound_irq();
> -
> -		dynamic_irq_init(irq);
> -		set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
> -					      handle_level_irq, "event");
> -
> -		evtchn_to_irq[evtchn] = irq;
> -		irq_info[irq] = mk_irq_info(IRQT_EVTCHN, 0, evtchn);
> -	}
> -
> -	irq_bindcount[irq]++;
> -
> -	spin_unlock(&irq_mapping_update_lock);
> -
> -	return irq;
> -}
> -EXPORT_SYMBOL_GPL(bind_evtchn_to_irq);
> -
> -static int bind_ipi_to_irq(unsigned int ipi, unsigned int cpu)
> -{
> -	struct evtchn_bind_ipi bind_ipi;
> -	int evtchn, irq;
> -
> -	spin_lock(&irq_mapping_update_lock);
> -
> -	irq = per_cpu(ipi_to_irq, cpu)[ipi];
> -	if (irq == -1) {
> -		irq = find_unbound_irq();
> -		if (irq < 0)
> -			goto out;
> -
> -		dynamic_irq_init(irq);
> -		set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
> -					      handle_level_irq, "ipi");
> -
> -		bind_ipi.vcpu = cpu;
> -		if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_ipi,
> -						&bind_ipi) != 0)
> -			BUG();
> -		evtchn = bind_ipi.port;
> -
> -		evtchn_to_irq[evtchn] = irq;
> -		irq_info[irq] = mk_irq_info(IRQT_IPI, ipi, evtchn);
> -
> -		per_cpu(ipi_to_irq, cpu)[ipi] = irq;
> -
> -		bind_evtchn_to_cpu(evtchn, cpu);
> -	}
> -
> -	irq_bindcount[irq]++;
> -
> - out:
> -	spin_unlock(&irq_mapping_update_lock);
> -	return irq;
> -}
> -
> -
> -static int bind_virq_to_irq(unsigned int virq, unsigned int cpu)
> -{
> -	struct evtchn_bind_virq bind_virq;
> -	int evtchn, irq;
> -
> -	spin_lock(&irq_mapping_update_lock);
> -
> -	irq = per_cpu(virq_to_irq, cpu)[virq];
> -
> -	if (irq == -1) {
> -		bind_virq.virq = virq;
> -		bind_virq.vcpu = cpu;
> -		if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq,
> -						&bind_virq) != 0)
> -			BUG();
> -		evtchn = bind_virq.port;
> -
> -		irq = find_unbound_irq();
> -
> -		dynamic_irq_init(irq);
> -		set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
> -					      handle_level_irq, "virq");
> -
> -		evtchn_to_irq[evtchn] = irq;
> -		irq_info[irq] = mk_irq_info(IRQT_VIRQ, virq, evtchn);
> -
> -		per_cpu(virq_to_irq, cpu)[virq] = irq;
> -
> -		bind_evtchn_to_cpu(evtchn, cpu);
> -	}
> -
> -	irq_bindcount[irq]++;
> -
> -	spin_unlock(&irq_mapping_update_lock);
> -
> -	return irq;
> -}
> -
> -static void unbind_from_irq(unsigned int irq)
> -{
> -	struct evtchn_close close;
> -	int evtchn = evtchn_from_irq(irq);
> -
> -	spin_lock(&irq_mapping_update_lock);
> -
> -	if (VALID_EVTCHN(evtchn) && (--irq_bindcount[irq] == 0)) {
> -		close.port = evtchn;
> -		if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0)
> -			BUG();
> -
> -		switch (type_from_irq(irq)) {
> -		case IRQT_VIRQ:
> -			per_cpu(virq_to_irq, cpu_from_evtchn(evtchn))
> -				[index_from_irq(irq)] = -1;
> -			break;
> -		default:
> -			break;
> -		}
> -
> -		/* Closed ports are implicitly re-bound to VCPU0. */
> -		bind_evtchn_to_cpu(evtchn, 0);
> -
> -		evtchn_to_irq[evtchn] = -1;
> -		irq_info[irq] = IRQ_UNBOUND;
> -
> -		dynamic_irq_init(irq);
> -	}
> -
> -	spin_unlock(&irq_mapping_update_lock);
> -}
> -
> -int bind_evtchn_to_irqhandler(unsigned int evtchn,
> -			      irq_handler_t handler,
> -			      unsigned long irqflags,
> -			      const char *devname, void *dev_id)
> -{
> -	unsigned int irq;
> -	int retval;
> -
> -	irq = bind_evtchn_to_irq(evtchn);
> -	retval = request_irq(irq, handler, irqflags, devname, dev_id);
> -	if (retval != 0) {
> -		unbind_from_irq(irq);
> -		return retval;
> -	}
> -
> -	return irq;
> -}
> -EXPORT_SYMBOL_GPL(bind_evtchn_to_irqhandler);
> -
> -int bind_virq_to_irqhandler(unsigned int virq, unsigned int cpu,
> -			    irq_handler_t handler,
> -			    unsigned long irqflags, const char *devname,
> -			    void *dev_id)
> -{
> -	unsigned int irq;
> -	int retval;
> -
> -	irq = bind_virq_to_irq(virq, cpu);
> -	retval = request_irq(irq, handler, irqflags, devname, dev_id);
> -	if (retval != 0) {
> -		unbind_from_irq(irq);
> -		return retval;
> -	}
> -
> -	return irq;
> -}
> -EXPORT_SYMBOL_GPL(bind_virq_to_irqhandler);
> -
> -int bind_ipi_to_irqhandler(enum ipi_vector ipi,
> -			   unsigned int cpu,
> -			   irq_handler_t handler,
> -			   unsigned long irqflags,
> -			   const char *devname,
> -			   void *dev_id)
> -{
> -	int irq, retval;
> -
> -	irq = bind_ipi_to_irq(ipi, cpu);
> -	if (irq < 0)
> -		return irq;
> -
> -	retval = request_irq(irq, handler, irqflags, devname, dev_id);
> -	if (retval != 0) {
> -		unbind_from_irq(irq);
> -		return retval;
> -	}
> -
> -	return irq;
> -}
> -
> -void unbind_from_irqhandler(unsigned int irq, void *dev_id)
> -{
> -	free_irq(irq, dev_id);
> -	unbind_from_irq(irq);
> -}
> -EXPORT_SYMBOL_GPL(unbind_from_irqhandler);
> -
> -void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector)
> -{
> -	int irq = per_cpu(ipi_to_irq, cpu)[vector];
> -	BUG_ON(irq < 0);
> -	notify_remote_via_irq(irq);
> -}
> -
> -
> -/*
> - * Search the CPUs pending events bitmasks.  For each one found, map
> - * the event number to an irq, and feed it into do_IRQ() for
> - * handling.
> - *
> - * Xen uses a two-level bitmap to speed searching.  The first level is
> - * a bitset of words which contain pending event bits.  The second
> - * level is a bitset of pending events themselves.
> - */
> -void xen_evtchn_do_upcall(struct pt_regs *regs)
> -{
> -	int cpu = get_cpu();
> -	struct shared_info *s = HYPERVISOR_shared_info;
> -	struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu);
> -	unsigned long pending_words;
> -
> -	vcpu_info->evtchn_upcall_pending = 0;
> -
> -	/* NB. No need for a barrier here -- XCHG is a barrier on x86. */
> -	pending_words = xchg(&vcpu_info->evtchn_pending_sel, 0);
> -	while (pending_words != 0) {
> -		unsigned long pending_bits;
> -		int word_idx = __ffs(pending_words);
> -		pending_words &= ~(1UL << word_idx);
> -
> -		while ((pending_bits = active_evtchns(cpu, s, word_idx)) != 0) {
> -			int bit_idx = __ffs(pending_bits);
> -			int port = (word_idx * BITS_PER_LONG) + bit_idx;
> -			int irq = evtchn_to_irq[port];
> -
> -			if (irq != -1) {
> -				regs->orig_ax = ~irq;
> -				do_IRQ(regs);
> -			}
> -		}
> -	}
> -
> -	put_cpu();
> -}
> -
> -/* Rebind an evtchn so that it gets delivered to a specific cpu */
> -static void rebind_irq_to_cpu(unsigned irq, unsigned tcpu)
> -{
> -	struct evtchn_bind_vcpu bind_vcpu;
> -	int evtchn = evtchn_from_irq(irq);
> -
> -	if (!VALID_EVTCHN(evtchn))
> -		return;
> -
> -	/* Send future instances of this interrupt to other vcpu. */
> -	bind_vcpu.port = evtchn;
> -	bind_vcpu.vcpu = tcpu;
> -
> -	/*
> -	 * If this fails, it usually just indicates that we're dealing
> -	 * with a virq or IPI channel, which don't actually need to be
> -	 * rebound. Ignore it, but don't do the xenlinux-level rebind in
> -	 * that case.
> -	 */
> -	if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu,
> -					&bind_vcpu) >= 0)
> -		bind_evtchn_to_cpu(evtchn, tcpu);
> -}
> -
> -
> -static void set_affinity_irq(unsigned irq, cpumask_t dest)
> -{
> -	unsigned tcpu = first_cpu(dest);
> -	rebind_irq_to_cpu(irq, tcpu);
> -}
> -
> -static void enable_dynirq(unsigned int irq)
> -{
> -	int evtchn = evtchn_from_irq(irq);
> -
> -	if (VALID_EVTCHN(evtchn))
> -		unmask_evtchn(evtchn);
> -}
> -
> -static void disable_dynirq(unsigned int irq)
> -{
> -	int evtchn = evtchn_from_irq(irq);
> -
> -	if (VALID_EVTCHN(evtchn))
> -		mask_evtchn(evtchn);
> -}
> -
> -static void ack_dynirq(unsigned int irq)
> -{
> -	int evtchn = evtchn_from_irq(irq);
> -
> -	move_native_irq(irq);
> -
> -	if (VALID_EVTCHN(evtchn))
> -		clear_evtchn(evtchn);
> -}
> -
> -static int retrigger_dynirq(unsigned int irq)
> -{
> -	int evtchn = evtchn_from_irq(irq);
> -	int ret = 0;
> -
> -	if (VALID_EVTCHN(evtchn)) {
> -		set_evtchn(evtchn);
> -		ret = 1;
> -	}
> -
> -	return ret;
> -}
> -
> -static struct irq_chip xen_dynamic_chip __read_mostly = {
> -	.name		= "xen-dyn",
> -	.mask		= disable_dynirq,
> -	.unmask		= enable_dynirq,
> -	.ack		= ack_dynirq,
> -	.set_affinity	= set_affinity_irq,
> -	.retrigger	= retrigger_dynirq,
> -};
> -
> -void __init xen_init_IRQ(void)
> -{
> -	int i;
> -
> -	init_evtchn_cpu_bindings();
> -
> -	/* No event channels are 'live' right now. */
> -	for (i = 0; i < NR_EVENT_CHANNELS; i++)
> -		mask_evtchn(i);
> -
> -	/* Dynamic IRQ space is currently unbound. Zero the refcnts. */
> -	for (i = 0; i < NR_IRQS; i++)
> -		irq_bindcount[i] = 0;
> -
> -	irq_ctx_init(smp_processor_id());
> -}
> diff -urN old/arch/x86/xen/Makefile linux/arch/x86/xen/Makefile
> --- old/arch/x86/xen/Makefile	2008-03-10 13:22:27.000000000 +0800
> +++ linux/arch/x86/xen/Makefile	2008-03-25 13:56:41.367764448 +0800
> @@ -1,4 +1,4 @@
>  obj-y		:= enlighten.o setup.o features.o multicalls.o mmu.o \
> -			events.o time.o manage.o xen-asm.o
> +			time.o manage.o xen-asm.o
>  
>  obj-$(CONFIG_SMP)	+= smp.o
> diff -urN old/arch/x86/xen/xen-ops.h linux/arch/x86/xen/xen-ops.h
> --- old/arch/x86/xen/xen-ops.h	2008-03-25 13:21:09.996527604 +0800
> +++ linux/arch/x86/xen/xen-ops.h	2008-03-25 13:59:16.349809137 +0800
> @@ -2,6 +2,7 @@
>  #define XEN_OPS_H
>  
>  #include <linux/init.h>
> +#include <xen/xen-ops.h>
>  
>  /* These are code, but not functions.  Defined in entry.S */
>  extern const char xen_hypervisor_callback[];
> @@ -9,7 +10,6 @@
>  
>  void xen_copy_trap_info(struct trap_info *traps);
>  
> -DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu);
>  DECLARE_PER_CPU(unsigned long, xen_cr3);
>  DECLARE_PER_CPU(unsigned long, xen_current_cr3);
>  
> diff -urN old/drivers/xen/events.c linux/drivers/xen/events.c
> --- old/drivers/xen/events.c	1970-01-01 08:00:00.000000000 +0800
> +++ linux/drivers/xen/events.c	2008-03-25 13:56:41.368764287 +0800
> @@ -0,0 +1,591 @@
> +/*
> + * Xen event channels
> + *
> + * Xen models interrupts with abstract event channels.  Because each
> + * domain gets 1024 event channels, but NR_IRQ is not that large, we
> + * must dynamically map irqs<->event channels.  The event channels
> + * interface with the rest of the kernel by defining a xen interrupt
> + * chip.  When an event is received, it is mapped to an irq and sent
> + * through the normal interrupt processing path.
> + *
> + * There are four kinds of events which can be mapped to an event
> + * channel:
> + *
> + * 1. Inter-domain notifications.  This includes all the virtual
> + *    device events, since they're driven by front-ends in another
> + *    domain (typically dom0).
> + * 2. VIRQs, typically used for timers.  These are per-cpu events.
> + * 3. IPIs.
> + * 4. Hardware interrupts. Not supported at present.
> + *
> + * Jeremy Fitzhardinge <jeremy@xensource.com>, XenSource Inc, 2007
> + */
> +
> +#include <linux/linkage.h>
> +#include <linux/interrupt.h>
> +#include <linux/irq.h>
> +#include <linux/module.h>
> +#include <linux/string.h>
> +
> +#include <asm/ptrace.h>
> +#include <asm/irq.h>
> +#include <asm/sync_bitops.h>
> +#include <asm/xen/hypercall.h>
> +#include <asm/xen/hypervisor.h>
> +
> +#include <xen/events.h>
> +#include <xen/interface/xen.h>
> +#include <xen/interface/event_channel.h>
> +
> +#include "xen-ops.h"
> +
> +/*
> + * This lock protects updates to the following mapping and
> + * reference-count arrays. The lock does not need to be acquired to
> + * read the mapping tables.
> + */
> +static DEFINE_SPINLOCK(irq_mapping_update_lock);
> +
> +/* IRQ <-> VIRQ mapping. */
> +static DEFINE_PER_CPU(int, virq_to_irq[NR_VIRQS]) = {[0 ... NR_VIRQS-1] = -1};
> +
> +/* IRQ <-> IPI mapping */
> +static DEFINE_PER_CPU(int, ipi_to_irq[XEN_NR_IPIS]) = {[0 ... XEN_NR_IPIS-1] = -1};
> +
> +/* Packed IRQ information: binding type, sub-type index, and event channel. */
> +struct packed_irq
> +{
> +	unsigned short evtchn;
> +	unsigned char index;
> +	unsigned char type;
> +};
> +
> +static struct packed_irq irq_info[NR_IRQS];
> +
> +/* Binding types. */
> +enum {
> +	IRQT_UNBOUND,
> +	IRQT_PIRQ,
> +	IRQT_VIRQ,
> +	IRQT_IPI,
> +	IRQT_EVTCHN
> +};
> +
> +/* Convenient shorthand for packed representation of an unbound IRQ. */
> +#define IRQ_UNBOUND	mk_irq_info(IRQT_UNBOUND, 0, 0)
> +
> +static int evtchn_to_irq[NR_EVENT_CHANNELS] = {
> +	[0 ... NR_EVENT_CHANNELS-1] = -1
> +};
> +static unsigned long
> +cpu_evtchn_mask[NR_CPUS][NR_EVENT_CHANNELS/BITS_PER_LONG];
> +static u8 cpu_evtchn[NR_EVENT_CHANNELS];
> +
> +/* Reference counts for bindings to IRQs. */
> +static int irq_bindcount[NR_IRQS];
> +
> +/* Xen will never allocate port zero for any purpose. */
> +#define VALID_EVTCHN(chn)	((chn) != 0)
> +
> +/*
> + * Force a proper event-channel callback from Xen after clearing the
> + * callback mask. We do this in a very simple manner, by making a call
> + * down into Xen. The pending flag will be checked by Xen on return.
> + */
> +void force_evtchn_callback(void)
> +{
> +	(void)HYPERVISOR_xen_version(0, NULL);
> +}
> +EXPORT_SYMBOL_GPL(force_evtchn_callback);
> +
> +static struct irq_chip xen_dynamic_chip;
> +
> +/* Constructor for packed IRQ information. */
> +static inline struct packed_irq mk_irq_info(u32 type, u32 index, u32 evtchn)
> +{
> +	return (struct packed_irq) { evtchn, index, type };
> +}
> +
> +/*
> + * Accessors for packed IRQ information.
> + */
> +static inline unsigned int evtchn_from_irq(int irq)
> +{
> +	return irq_info[irq].evtchn;
> +}
> +
> +static inline unsigned int index_from_irq(int irq)
> +{
> +	return irq_info[irq].index;
> +}
> +
> +static inline unsigned int type_from_irq(int irq)
> +{
> +	return irq_info[irq].type;
> +}
> +
> +static inline unsigned long active_evtchns(unsigned int cpu,
> +					   struct shared_info *sh,
> +					   unsigned int idx)
> +{
> +	return (sh->evtchn_pending[idx] &
> +		cpu_evtchn_mask[cpu][idx] &
> +		~sh->evtchn_mask[idx]);
> +}
> +
> +static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu)
> +{
> +	int irq = evtchn_to_irq[chn];
> +
> +	BUG_ON(irq == -1);
> +#ifdef CONFIG_SMP
> +	irq_desc[irq].affinity = cpumask_of_cpu(cpu);
> +#endif
> +
> +	__clear_bit(chn, cpu_evtchn_mask[cpu_evtchn[chn]]);
> +	__set_bit(chn, cpu_evtchn_mask[cpu]);
> +
> +	cpu_evtchn[chn] = cpu;
> +}
> +
> +static void init_evtchn_cpu_bindings(void)
> +{
> +#ifdef CONFIG_SMP
> +	int i;
> +	/* By default all event channels notify CPU#0. */
> +	for (i = 0; i < NR_IRQS; i++)
> +		irq_desc[i].affinity = cpumask_of_cpu(0);
> +#endif
> +
> +	memset(cpu_evtchn, 0, sizeof(cpu_evtchn));
> +	memset(cpu_evtchn_mask[0], ~0, sizeof(cpu_evtchn_mask[0]));
> +}
> +
> +static inline unsigned int cpu_from_evtchn(unsigned int evtchn)
> +{
> +	return cpu_evtchn[evtchn];
> +}
> +
> +static inline void clear_evtchn(int port)
> +{
> +	struct shared_info *s = HYPERVISOR_shared_info;
> +	sync_clear_bit(port, &s->evtchn_pending[0]);
> +}
> +
> +static inline void set_evtchn(int port)
> +{
> +	struct shared_info *s = HYPERVISOR_shared_info;
> +	sync_set_bit(port, &s->evtchn_pending[0]);
> +}
> +
> +
> +/**
> + * notify_remote_via_irq - send event to remote end of event channel via irq
> + * @irq: irq of event channel to send event to
> + *
> + * Unlike notify_remote_via_evtchn(), this is safe to use across
> + * save/restore. Notifications on a broken connection are silently
> + * dropped.
> + */
> +void notify_remote_via_irq(int irq)
> +{
> +	int evtchn = evtchn_from_irq(irq);
> +
> +	if (VALID_EVTCHN(evtchn))
> +		notify_remote_via_evtchn(evtchn);
> +}
> +EXPORT_SYMBOL_GPL(notify_remote_via_irq);
> +
> +static void mask_evtchn(int port)
> +{
> +	struct shared_info *s = HYPERVISOR_shared_info;
> +	sync_set_bit(port, &s->evtchn_mask[0]);
> +}
> +
> +static void unmask_evtchn(int port)
> +{
> +	struct shared_info *s = HYPERVISOR_shared_info;
> +	unsigned int cpu = get_cpu();
> +
> +	BUG_ON(!irqs_disabled());
> +
> +	/* Slow path (hypercall) if this is a non-local port. */
> +	if (unlikely(cpu != cpu_from_evtchn(port))) {
> +		struct evtchn_unmask unmask = { .port = port };
> +		(void)HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask);
> +	} else {
> +		struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu);
> +
> +		sync_clear_bit(port, &s->evtchn_mask[0]);
> +
> +		/*
> +		 * The following is basically the equivalent of
> +		 * 'hw_resend_irq'. Just like a real IO-APIC we 'lose
> +		 * the interrupt edge' if the channel is masked.
> +		 */
> +		if (sync_test_bit(port, &s->evtchn_pending[0]) &&
> +		    !sync_test_and_set_bit(port / BITS_PER_LONG,
> +					   &vcpu_info->evtchn_pending_sel))
> +			vcpu_info->evtchn_upcall_pending = 1;
> +	}
> +
> +	put_cpu();
> +}
> +
> +static int find_unbound_irq(void)
> +{
> +	int irq;
> +
> +	/* Only allocate from dynirq range */
> +	for (irq = 0; irq < NR_IRQS; irq++)
> +		if (irq_bindcount[irq] == 0)
> +			break;
> +
> +	if (irq == NR_IRQS)
> +		panic("No available IRQ to bind to: increase NR_IRQS!\n");
> +
> +	return irq;
> +}
> +
> +int bind_evtchn_to_irq(unsigned int evtchn)
> +{
> +	int irq;
> +
> +	spin_lock(&irq_mapping_update_lock);
> +
> +	irq = evtchn_to_irq[evtchn];
> +
> +	if (irq == -1) {
> +		irq = find_unbound_irq();
> +
> +		dynamic_irq_init(irq);
> +		set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
> +					      handle_level_irq, "event");
> +
> +		evtchn_to_irq[evtchn] = irq;
> +		irq_info[irq] = mk_irq_info(IRQT_EVTCHN, 0, evtchn);
> +	}
> +
> +	irq_bindcount[irq]++;
> +
> +	spin_unlock(&irq_mapping_update_lock);
> +
> +	return irq;
> +}
> +EXPORT_SYMBOL_GPL(bind_evtchn_to_irq);
> +
> +static int bind_ipi_to_irq(unsigned int ipi, unsigned int cpu)
> +{
> +	struct evtchn_bind_ipi bind_ipi;
> +	int evtchn, irq;
> +
> +	spin_lock(&irq_mapping_update_lock);
> +
> +	irq = per_cpu(ipi_to_irq, cpu)[ipi];
> +	if (irq == -1) {
> +		irq = find_unbound_irq();
> +		if (irq < 0)
> +			goto out;
> +
> +		dynamic_irq_init(irq);
> +		set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
> +					      handle_level_irq, "ipi");
> +
> +		bind_ipi.vcpu = cpu;
> +		if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_ipi,
> +						&bind_ipi) != 0)
> +			BUG();
> +		evtchn = bind_ipi.port;
> +
> +		evtchn_to_irq[evtchn] = irq;
> +		irq_info[irq] = mk_irq_info(IRQT_IPI, ipi, evtchn);
> +
> +		per_cpu(ipi_to_irq, cpu)[ipi] = irq;
> +
> +		bind_evtchn_to_cpu(evtchn, cpu);
> +	}
> +
> +	irq_bindcount[irq]++;
> +
> + out:
> +	spin_unlock(&irq_mapping_update_lock);
> +	return irq;
> +}
> +
> +
> +static int bind_virq_to_irq(unsigned int virq, unsigned int cpu)
> +{
> +	struct evtchn_bind_virq bind_virq;
> +	int evtchn, irq;
> +
> +	spin_lock(&irq_mapping_update_lock);
> +
> +	irq = per_cpu(virq_to_irq, cpu)[virq];
> +
> +	if (irq == -1) {
> +		bind_virq.virq = virq;
> +		bind_virq.vcpu = cpu;
> +		if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq,
> +						&bind_virq) != 0)
> +			BUG();
> +		evtchn = bind_virq.port;
> +
> +		irq = find_unbound_irq();
> +
> +		dynamic_irq_init(irq);
> +		set_irq_chip_and_handler_name(irq, &xen_dynamic_chip,
> +					      handle_level_irq, "virq");
> +
> +		evtchn_to_irq[evtchn] = irq;
> +		irq_info[irq] = mk_irq_info(IRQT_VIRQ, virq, evtchn);
> +
> +		per_cpu(virq_to_irq, cpu)[virq] = irq;
> +
> +		bind_evtchn_to_cpu(evtchn, cpu);
> +	}
> +
> +	irq_bindcount[irq]++;
> +
> +	spin_unlock(&irq_mapping_update_lock);
> +
> +	return irq;
> +}
> +
> +static void unbind_from_irq(unsigned int irq)
> +{
> +	struct evtchn_close close;
> +	int evtchn = evtchn_from_irq(irq);
> +
> +	spin_lock(&irq_mapping_update_lock);
> +
> +	if (VALID_EVTCHN(evtchn) && (--irq_bindcount[irq] == 0)) {
> +		close.port = evtchn;
> +		if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0)
> +			BUG();
> +
> +		switch (type_from_irq(irq)) {
> +		case IRQT_VIRQ:
> +			per_cpu(virq_to_irq, cpu_from_evtchn(evtchn))
> +				[index_from_irq(irq)] = -1;
> +			break;
> +		default:
> +			break;
> +		}
> +
> +		/* Closed ports are implicitly re-bound to VCPU0. */
> +		bind_evtchn_to_cpu(evtchn, 0);
> +
> +		evtchn_to_irq[evtchn] = -1;
> +		irq_info[irq] = IRQ_UNBOUND;
> +
> +		dynamic_irq_init(irq);
> +	}
> +
> +	spin_unlock(&irq_mapping_update_lock);
> +}
> +
> +int bind_evtchn_to_irqhandler(unsigned int evtchn,
> +			      irq_handler_t handler,
> +			      unsigned long irqflags,
> +			      const char *devname, void *dev_id)
> +{
> +	unsigned int irq;
> +	int retval;
> +
> +	irq = bind_evtchn_to_irq(evtchn);
> +	retval = request_irq(irq, handler, irqflags, devname, dev_id);
> +	if (retval != 0) {
> +		unbind_from_irq(irq);
> +		return retval;
> +	}
> +
> +	return irq;
> +}
> +EXPORT_SYMBOL_GPL(bind_evtchn_to_irqhandler);
> +
> +int bind_virq_to_irqhandler(unsigned int virq, unsigned int cpu,
> +			    irq_handler_t handler,
> +			    unsigned long irqflags, const char *devname,
> +			    void *dev_id)
> +{
> +	unsigned int irq;
> +	int retval;
> +
> +	irq = bind_virq_to_irq(virq, cpu);
> +	retval = request_irq(irq, handler, irqflags, devname, dev_id);
> +	if (retval != 0) {
> +		unbind_from_irq(irq);
> +		return retval;
> +	}
> +
> +	return irq;
> +}
> +EXPORT_SYMBOL_GPL(bind_virq_to_irqhandler);
> +
> +int bind_ipi_to_irqhandler(enum ipi_vector ipi,
> +			   unsigned int cpu,
> +			   irq_handler_t handler,
> +			   unsigned long irqflags,
> +			   const char *devname,
> +			   void *dev_id)
> +{
> +	int irq, retval;
> +
> +	irq = bind_ipi_to_irq(ipi, cpu);
> +	if (irq < 0)
> +		return irq;
> +
> +	retval = request_irq(irq, handler, irqflags, devname, dev_id);
> +	if (retval != 0) {
> +		unbind_from_irq(irq);
> +		return retval;
> +	}
> +
> +	return irq;
> +}
> +
> +void unbind_from_irqhandler(unsigned int irq, void *dev_id)
> +{
> +	free_irq(irq, dev_id);
> +	unbind_from_irq(irq);
> +}
> +EXPORT_SYMBOL_GPL(unbind_from_irqhandler);
> +
> +void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector)
> +{
> +	int irq = per_cpu(ipi_to_irq, cpu)[vector];
> +	BUG_ON(irq < 0);
> +	notify_remote_via_irq(irq);
> +}
> +
> +
> +/*
> + * Search the CPUs pending events bitmasks.  For each one found, map
> + * the event number to an irq, and feed it into do_IRQ() for
> + * handling.
> + *
> + * Xen uses a two-level bitmap to speed searching.  The first level is
> + * a bitset of words which contain pending event bits.  The second
> + * level is a bitset of pending events themselves.
> + */
> +void xen_evtchn_do_upcall(struct pt_regs *regs)
> +{
> +	int cpu = get_cpu();
> +	struct shared_info *s = HYPERVISOR_shared_info;
> +	struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu);
> +	unsigned long pending_words;
> +
> +	vcpu_info->evtchn_upcall_pending = 0;
> +
> +	/* NB. No need for a barrier here -- XCHG is a barrier on x86. */
> +	pending_words = xchg(&vcpu_info->evtchn_pending_sel, 0);
> +	while (pending_words != 0) {
> +		unsigned long pending_bits;
> +		int word_idx = __ffs(pending_words);
> +		pending_words &= ~(1UL << word_idx);
> +
> +		while ((pending_bits = active_evtchns(cpu, s, word_idx)) != 0) {
> +			int bit_idx = __ffs(pending_bits);
> +			int port = (word_idx * BITS_PER_LONG) + bit_idx;
> +			int irq = evtchn_to_irq[port];
> +
> +			if (irq != -1) {
> +				regs->orig_ax = ~irq;
> +				do_IRQ(regs);
> +			}
> +		}
> +	}
> +
> +	put_cpu();
> +}
> +
> +/* Rebind an evtchn so that it gets delivered to a specific cpu */
> +static void rebind_irq_to_cpu(unsigned irq, unsigned tcpu)
> +{
> +	struct evtchn_bind_vcpu bind_vcpu;
> +	int evtchn = evtchn_from_irq(irq);
> +
> +	if (!VALID_EVTCHN(evtchn))
> +		return;
> +
> +	/* Send future instances of this interrupt to other vcpu. */
> +	bind_vcpu.port = evtchn;
> +	bind_vcpu.vcpu = tcpu;
> +
> +	/*
> +	 * If this fails, it usually just indicates that we're dealing
> +	 * with a virq or IPI channel, which don't actually need to be
> +	 * rebound. Ignore it, but don't do the xenlinux-level rebind in
> +	 * that case.
> +	 */
> +	if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu,
> +					&bind_vcpu) >= 0)
> +		bind_evtchn_to_cpu(evtchn, tcpu);
> +}
> +
> +
> +static void set_affinity_irq(unsigned irq, cpumask_t dest)
> +{
> +	unsigned tcpu = first_cpu(dest);
> +	rebind_irq_to_cpu(irq, tcpu);
> +}
> +
> +static void enable_dynirq(unsigned int irq)
> +{
> +	int evtchn = evtchn_from_irq(irq);
> +
> +	if (VALID_EVTCHN(evtchn))
> +		unmask_evtchn(evtchn);
> +}
> +
> +static void disable_dynirq(unsigned int irq)
> +{
> +	int evtchn = evtchn_from_irq(irq);
> +
> +	if (VALID_EVTCHN(evtchn))
> +		mask_evtchn(evtchn);
> +}
> +
> +static void ack_dynirq(unsigned int irq)
> +{
> +	int evtchn = evtchn_from_irq(irq);
> +
> +	move_native_irq(irq);
> +
> +	if (VALID_EVTCHN(evtchn))
> +		clear_evtchn(evtchn);
> +}
> +
> +static int retrigger_dynirq(unsigned int irq)
> +{
> +	int evtchn = evtchn_from_irq(irq);
> +	int ret = 0;
> +
> +	if (VALID_EVTCHN(evtchn)) {
> +		set_evtchn(evtchn);
> +		ret = 1;
> +	}
> +
> +	return ret;
> +}
> +
> +static struct irq_chip xen_dynamic_chip __read_mostly = {
> +	.name		= "xen-dyn",
> +	.mask		= disable_dynirq,
> +	.unmask		= enable_dynirq,
> +	.ack		= ack_dynirq,
> +	.set_affinity	= set_affinity_irq,
> +	.retrigger	= retrigger_dynirq,
> +};
> +
> +void __init xen_init_IRQ(void)
> +{
> +	int i;
> +
> +	init_evtchn_cpu_bindings();
> +
> +	/* No event channels are 'live' right now. */
> +	for (i = 0; i < NR_EVENT_CHANNELS; i++)
> +		mask_evtchn(i);
> +
> +	/* Dynamic IRQ space is currently unbound. Zero the refcnts. */
> +	for (i = 0; i < NR_IRQS; i++)
> +		irq_bindcount[i] = 0;
> +
> +	irq_ctx_init(smp_processor_id());
> +}
> diff -urN old/drivers/xen/Makefile linux/drivers/xen/Makefile
> --- old/drivers/xen/Makefile	2008-03-10 13:22:27.000000000 +0800
> +++ linux/drivers/xen/Makefile	2008-03-25 13:56:41.368764287 +0800
> @@ -1,2 +1,2 @@
> -obj-y	+= grant-table.o
> +obj-y	+= grant-table.o events.o
>  obj-y	+= xenbus/
> diff -urN old/include/xen/xen-ops.h linux/include/xen/xen-ops.h
> --- old/include/xen/xen-ops.h	1970-01-01 08:00:00.000000000 +0800
> +++ linux/include/xen/xen-ops.h	2008-03-25 14:00:09.041321546 +0800
> @@ -0,0 +1,6 @@
> +#ifndef INCLUDE_XEN_OPS_H
> +#define INCLUDE_XEN_OPS_H
> +
> +DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu);
>   

This should include <linux/percpu.h> rather than assuming it has already 
been included.

    J
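For concreteness, the header with the suggested include would look something like the sketch below (not the actual follow-up patch, just the shape of the fix), so that DECLARE_PER_CPU is defined regardless of include order:

```
#ifndef INCLUDE_XEN_OPS_H
#define INCLUDE_XEN_OPS_H

#include <linux/percpu.h>

DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu);

#endif /* INCLUDE_XEN_OPS_H */
```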



Thread overview: 7+ messages
2008-03-05 18:18 [PATCH 00/50] ia64/xen take 3: ia64/xen domU paravirtualization Isaku Yamahata
2008-03-05 18:19 ` [PATCH 50/50] ia64/pv_ops/xen: define xen pv_irq_ops Isaku Yamahata
2008-03-20  9:13   ` Xen common code across architecture Dong, Eddie
2008-03-20 14:23     ` Jeremy Fitzhardinge
2008-03-25  6:13       ` Dong, Eddie
2008-03-25  6:35         ` Dong, Eddie
2008-03-25 15:08         ` Jeremy Fitzhardinge
