* [PATCH 1/2] Add Hypertransport capability defines. @ 2006-07-10 22:14 Eric W. Biederman 2006-07-10 22:26 ` [PATCH 2/2] Initial generic hypertransport interrupt support Eric W. Biederman 0 siblings, 1 reply; 22+ messages in thread From: Eric W. Biederman @ 2006-07-10 22:14 UTC (permalink / raw) To: Andrew Morton; +Cc: Dave Olson, linux-kernel This adds defines for the hypertransport capability subtypes and starts using them a little. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> --- This patch is against 2.6.18-rc1-mm1 arch/powerpc/sysdev/mpic.c | 2 +- include/linux/pci_regs.h | 21 +++++++++++++++++++++ 2 files changed, 22 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/sysdev/mpic.c b/arch/powerpc/sysdev/mpic.c index 9961eda..9be8c49 100644 --- a/arch/powerpc/sysdev/mpic.c +++ b/arch/powerpc/sysdev/mpic.c @@ -251,7 +251,7 @@ static void __init mpic_scan_ht_pic(stru u8 id = readb(devbase + pos + PCI_CAP_LIST_ID); if (id == PCI_CAP_ID_HT) { id = readb(devbase + pos + 3); - if (id == 0x80) + if (id == HT_CAPTYPE_IRQ) break; } } diff --git a/include/linux/pci_regs.h b/include/linux/pci_regs.h index 05b893e..0ea629c 100644 --- a/include/linux/pci_regs.h +++ b/include/linux/pci_regs.h @@ -12,6 +12,11 @@ * PCI Local Bus Specification * PCI to PCI Bridge Specification * PCI System Design Guide + * + * For hypertransport information, please consult the following manuals + * from http://www.hypertranposrt.org + * + * The Hypertransport I/O Link Specification */ #ifndef LINUX_PCI_REGS_H @@ -447,4 +452,20 @@ #define PCI_PWR_DATA_RAIL(x) (((x) >> 1 #define PCI_PWR_CAP 12 /* Capability */ #define PCI_PWR_CAP_BUDGET(x) ((x) & 1) /* Included in system budget */ +/* Hypertransport sub capability types */ +#define HT_CAPTYPE_SLAVE 0x00 /* Slave/Primary link configuration */ +#define HT_CAPTYPE_HOST 0x20 /* Host/Secondary link configuration */ +#define HT_CAPTYPE_IRQ 0x80 /* IRQ Configuration */ +#define HT_CAPTYPE_REMAPPING_40 0xA0 /* 40 bit address remapping */ +#define HT_CAPTYPE_REMAPPING_64 0xA2 /* 64 bit address remapping */ +#define HT_CAPTYPE_UNITID_CLUMP 0x90 /* Unit ID clumping */ +#define HT_CAPTYPE_EXTCONF 0x98 /* Extended Configuration Space Access */ +#define HT_CAPTYPE_MSI_MAPPING 0xA8 /* MSI Mapping Capability */ +#define HT_CAPTYPE_DIRECT_ROUTE 0xB0 /* Direct routing configuration */ +#define HT_CAPTYPE_VCSET 0xB8 /* Virtual Channel configuration */ +#define HT_CAPTYPE_ERROR_RETRY 0xC0 /* Retry on error configuration */ +#define HT_CAPTYPE_GEN3 0xD0 /* Generation 3 hypertransport configuration */ +#define HT_CAPTYPE_PM 0xE0 /* Hypertransport powermanagement configuration */ + + #endif /* LINUX_PCI_REGS_H */ -- 1.4.1.gac83a ^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 2/2] Initial generic hypertransport interrupt support. 2006-07-10 22:14 [PATCH 1/2] Add Hypertransport capability defines Eric W. Biederman @ 2006-07-10 22:26 ` Eric W. Biederman 2006-07-10 22:39 ` Benjamin Herrenschmidt 2006-07-11 22:27 ` Andi Kleen 0 siblings, 2 replies; 22+ messages in thread From: Eric W. Biederman @ 2006-07-10 22:26 UTC (permalink / raw) To: Andrew Morton; +Cc: Dave Olson, linux-kernel, Benjamin Herrenschmidt This patch implements two functions ht_create_irq and ht_destroy_irq for use by drivers. Several other functions are implemented as helpers for arch specific irq_chip handlers. The driver for the card I tested this on isn't yet ready to be merged. However this code is and hypertransport irqs are in use in a few other places in the kernel. Not that any of this will get merged before 2.6.19 Because the ipath-ht400 is slightly out of spec this code will need to be generalized to work there. I think all of the powerpc uses are for a plain interrupt controller in a chipset so support for native hypertransport devices is a little less interesting. However I think this is a half way decent model on how to separate arch specific and generic helper code, and I think this is a functional model of how to get the architecture dependencies out of the msi code. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> --- arch/i386/kernel/io_apic.c | 90 +++++++++++++++++ arch/x86_64/kernel/io_apic.c | 96 ++++++++++++++++++ drivers/pci/Kconfig | 10 ++ drivers/pci/Makefile | 1 drivers/pci/htirq.c | 190 +++++++++++++++++++++++++++++++++++ include/asm-i386/hypertransport.h | 42 ++++++++ include/asm-x86_64/hypertransport.h | 42 ++++++++ include/linux/pci.h | 17 +++ 8 files changed, 488 insertions(+), 0 deletions(-) diff --git a/arch/i386/kernel/io_apic.c b/arch/i386/kernel/io_apic.c index 70b5962..168e407 100644 --- a/arch/i386/kernel/io_apic.c +++ b/arch/i386/kernel/io_apic.c @@ -40,6 +40,7 @@ #include <asm/timer.h> #include <asm/i8259.h> #include <asm/nmi.h> #include <asm/msidef.h> +#include <asm/hypertransport.h> #include <mach_apic.h> #include <mach_apicdef.h> @@ -2532,6 +2533,95 @@ struct msi_ops arch_msi_ops = { #endif /* CONFIG_PCI_MSI */ +/* + * Hypertransport interrupt support + */ +#ifdef CONFIG_HT_IRQ + +#ifdef CONFIG_SMP + +static void target_ht_irq(unsigned int irq, unsigned int dest) +{ + u32 low, high; + low = read_ht_irq_low(irq); + high = read_ht_irq_high(irq); + + low &= ~(HT_IRQ_LOW_DEST_ID_MASK); + high &= ~(HT_IRQ_HIGH_DEST_ID_MASK); + + low |= HT_IRQ_LOW_DEST_ID(dest); + high |= HT_IRQ_HIGH_DEST_ID(dest); + + write_ht_irq_low(irq, low); + write_ht_irq_high(irq, high); +} + +static void set_ht_irq_affinity(unsigned int irq, cpumask_t mask) +{ + unsigned int dest; + cpumask_t tmp; + + cpus_and(tmp, mask, cpu_online_map); + if (cpus_empty(tmp)) + tmp = TARGET_CPUS; + + cpus_and(mask, tmp, CPU_MASK_ALL); + + dest = cpu_mask_to_apicid(mask); + + target_ht_irq(irq, dest); + set_native_irq_info(irq, mask); +} +#endif + +static struct hw_interrupt_type ht_irq_chip = { + .name = "PCI-HT", + .mask = mask_ht_irq, + .unmask = unmask_ht_irq, + .ack = ack_ioapic_irq, +#ifdef CONFIG_SMP + .set_affinity = set_ht_irq_affinity, +#endif + .retrigger = ioapic_retrigger_irq, +}; + +int arch_setup_ht_irq(unsigned int irq, struct pci_dev *dev) +{ + int vector; + + vector = assign_irq_vector(irq); + if (vector >= 0) { + u32 low, high; + unsigned dest; + cpumask_t tmp; + + cpus_clear(tmp); + cpu_set(vector >> 8, tmp); + dest = cpu_mask_to_apicid(tmp); + + high = HT_IRQ_HIGH_DEST_ID(dest); + + low = HT_IRQ_LOW_BASE | + HT_IRQ_LOW_DEST_ID(dest) | + HT_IRQ_LOW_VECTOR(vector) | + ((INT_DEST_MODE == 0) ? + HT_IRQ_LOW_DM_PHYSICAL : + HT_IRQ_LOW_DM_LOGICAL) | + HT_IRQ_LOW_RQEOI_EDGE | + ((INT_DELIVERY_MODE != dest_LowestPrio) ? + HT_IRQ_LOW_MT_FIXED : + HT_IRQ_LOW_MT_ARBITRATED) | + HT_IRQ_LOW_IRQ_MASKED; + + write_ht_irq_low(irq, low); + write_ht_irq_high(irq, high); + + set_irq_chip_and_handler(irq, &ht_irq_chip, handle_edge_irq); + } + return vector; +} +#endif /* CONFIG_HT_IRQ */ + /* -------------------------------------------------------------------------- ACPI-based IOAPIC Configuration -------------------------------------------------------------------------- */ diff --git a/arch/x86_64/kernel/io_apic.c b/arch/x86_64/kernel/io_apic.c index 8d030b4..317439e 100644 --- a/arch/x86_64/kernel/io_apic.c +++ b/arch/x86_64/kernel/io_apic.c @@ -42,6 +42,7 @@ #include <asm/acpi.h> #include <asm/dma.h> #include <asm/nmi.h> #include <asm/msidef.h> +#include <asm/hypertransport.h> static int assign_irq_vector(int irq, cpumask_t mask); @@ -2093,6 +2094,101 @@ struct msi_ops arch_msi_ops = { #endif +/* + * Hypertransport interrupt support + */ +#ifdef CONFIG_HT_IRQ + +#ifdef CONFIG_SMP + +static void target_ht_irq(unsigned int irq, unsigned int dest, u8 vector) +{ + u32 low, high; + low = read_ht_irq_low(irq); + high = read_ht_irq_high(irq); + + low &= ~(HT_IRQ_LOW_VECTOR_MASK | HT_IRQ_LOW_DEST_ID_MASK); + high &= ~(HT_IRQ_HIGH_DEST_ID_MASK); + + low |= HT_IRQ_LOW_VECTOR(vector) | HT_IRQ_LOW_DEST_ID(dest); + high |= HT_IRQ_HIGH_DEST_ID(dest); + + write_ht_irq_low(irq, low); + write_ht_irq_high(irq, high); +} + +static void set_ht_irq_affinity(unsigned int irq, cpumask_t mask) +{ + unsigned int dest; + cpumask_t tmp; + int vector; + + cpus_and(tmp, mask, cpu_online_map); + if (cpus_empty(tmp)) + tmp = TARGET_CPUS; + + cpus_and(mask, tmp, CPU_MASK_ALL); + + vector = assign_irq_vector(irq, mask); + if (vector < 0) + return; + + cpus_clear(tmp); + cpu_set(vector >> 8, tmp); + dest = cpu_mask_to_apicid(tmp); + + target_ht_irq(irq, dest, vector & 0xff); + set_native_irq_info(irq, mask); +} +#endif + +static struct hw_interrupt_type ht_irq_chip = { + .name = "PCI-HT", + .mask = mask_ht_irq, + .unmask = unmask_ht_irq, + .ack = ack_apic_edge, +#ifdef CONFIG_SMP + .set_affinity = set_ht_irq_affinity, +#endif + .retrigger = ioapic_retrigger_irq, +}; + +int arch_setup_ht_irq(unsigned int irq, struct pci_dev *dev) +{ + int vector; + + vector = assign_irq_vector(irq, TARGET_CPUS); + if (vector >= 0) { + u32 low, high; + unsigned dest; + cpumask_t tmp; + + cpus_clear(tmp); + cpu_set(vector >> 8, tmp); + dest = cpu_mask_to_apicid(tmp); + + high = HT_IRQ_HIGH_DEST_ID(dest); + + low = HT_IRQ_LOW_BASE | + HT_IRQ_LOW_DEST_ID(dest) | + HT_IRQ_LOW_VECTOR(vector) | + ((INT_DEST_MODE == 0) ? + HT_IRQ_LOW_DM_PHYSICAL : + HT_IRQ_LOW_DM_LOGICAL) | + HT_IRQ_LOW_RQEOI_EDGE | + ((INT_DELIVERY_MODE != dest_LowestPrio) ? + HT_IRQ_LOW_MT_FIXED : + HT_IRQ_LOW_MT_ARBITRATED); + + write_ht_irq_low(irq, low); + write_ht_irq_high(irq, high); + + set_irq_chip_and_handler(irq, &ht_irq_chip, handle_edge_irq); + } + return vector; +} +#endif /* CONFIG_HT_IRQ */ + /* -------------------------------------------------------------------------- ACPI-based IOAPIC Configuration -------------------------------------------------------------------------- */ diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig index 4d762fc..3ec83ac 100644 --- a/drivers/pci/Kconfig +++ b/drivers/pci/Kconfig @@ -27,3 +27,13 @@ config PCI_DEBUG When in doubt, say N. +config HT_IRQ + bool "Interrupts on hypertransport devices" + default y + depends on PCI + depends on X86_LOCAL_APIC && X86_IO_APIC + help + This allows native hypertransport devices to use interrupts. + + If unsure say Y. + \ No newline at end of file diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile index 983d0f8..2752c57 100644 --- a/drivers/pci/Makefile +++ b/drivers/pci/Makefile @@ -26,6 +26,7 @@ obj-$(CONFIG_PPC32) += setup-irq.o obj-$(CONFIG_PPC64) += setup-bus.o obj-$(CONFIG_MIPS) += setup-bus.o setup-irq.o obj-$(CONFIG_X86_VISWS) += setup-irq.o +obj-$(CONFIG_HT_IRQ) += htirq.o msiobj-y := msi.o msiobj-$(CONFIG_IA64) += msi-apic.o diff --git a/drivers/pci/htirq.c b/drivers/pci/htirq.c new file mode 100644 index 0000000..d8f894b --- /dev/null +++ b/drivers/pci/htirq.c @@ -0,0 +1,190 @@ +/* + * File: htirq.c + * Purpose: Hypertransport Interrupt Capability + * + * Copyright (C) 2006 Linux Networx + * Copyright (C) Eric Biederman <ebiederman@lnxi.com> + */ + +#include <linux/irq.h> +#include <linux/pci.h> +#include <linux/spinlock.h> +#include <linux/slab.h> +#include <linux/gfp.h> + +/* Global ht irq lock. + * + * This is needed to serialize access to the data port in hypertransport + * irq capability. + * + * With multiple simultaneous hypertransport irq devices it might pay + * to make this more fine grained. But start with simple, stupid, and correct. + */ +static DEFINE_SPINLOCK(ht_irq_lock); + +struct ht_irq_cfg { + struct pci_dev *dev; + unsigned pos; + unsigned idx; +}; + +void write_ht_irq_low(unsigned int irq, u32 data) +{ + struct ht_irq_cfg *cfg = get_irq_data(irq); + unsigned long flags; + spin_lock_irqsave(&ht_irq_lock, flags); + pci_write_config_byte(cfg->dev, cfg->pos + 2, cfg->idx); + pci_write_config_dword(cfg->dev, cfg->pos + 4, data); + spin_unlock_irqrestore(&ht_irq_lock, flags); +} + +void write_ht_irq_high(unsigned int irq, u32 data) +{ + struct ht_irq_cfg *cfg = get_irq_data(irq); + unsigned long flags; + spin_lock_irqsave(&ht_irq_lock, flags); + pci_write_config_byte(cfg->dev, cfg->pos + 2, cfg->idx + 1); + pci_write_config_dword(cfg->dev, cfg->pos + 4, data); + spin_unlock_irqrestore(&ht_irq_lock, flags); +} + +u32 read_ht_irq_low(unsigned int irq) +{ + struct ht_irq_cfg *cfg = get_irq_data(irq); + unsigned long flags; + u32 data; + spin_lock_irqsave(&ht_irq_lock, flags); + pci_write_config_byte(cfg->dev, cfg->pos + 2, cfg->idx); + pci_read_config_dword(cfg->dev, cfg->pos + 4, &data); + spin_unlock_irqrestore(&ht_irq_lock, flags); + return data; +} + + +u32 read_ht_irq_high(unsigned int irq) +{ + struct ht_irq_cfg *cfg = get_irq_data(irq); + unsigned long flags; + u32 data; + spin_lock_irqsave(&ht_irq_lock, flags); + pci_write_config_byte(cfg->dev, cfg->pos + 2, cfg->idx + 1); + pci_read_config_dword(cfg->dev, cfg->pos + 4, &data); + spin_unlock_irqrestore(&ht_irq_lock, flags); + return data; +} + +void mask_ht_irq(unsigned int irq) +{ + struct ht_irq_cfg *cfg; + unsigned long flags; + u32 data; + + cfg = get_irq_data(irq); + + spin_lock_irqsave(&ht_irq_lock, flags); + pci_write_config_byte(cfg->dev, cfg->pos + 2, cfg->idx); + pci_read_config_dword(cfg->dev, cfg->pos + 4, &data); + data |= 1; + pci_write_config_dword(cfg->dev, cfg->pos + 4, data); + spin_unlock_irqrestore(&ht_irq_lock, flags); +} + +void unmask_ht_irq(unsigned int irq) +{ + struct ht_irq_cfg *cfg; + unsigned long flags; + u32 data; + + cfg = get_irq_data(irq); + + spin_lock_irqsave(&ht_irq_lock, flags); + pci_write_config_byte(cfg->dev, cfg->pos + 2, cfg->idx); + pci_read_config_dword(cfg->dev, cfg->pos + 4, &data); + data &= ~1; + pci_write_config_dword(cfg->dev, cfg->pos + 4, data); + spin_unlock_irqrestore(&ht_irq_lock, flags); +} + +/** + * ht_create_irq - create an irq and attach it to a device. + * @dev: The hypertransport device to find the irq capability on. + * @idx: Which of the possible irqs to attach to. + * + * ht_create_irq is needs to be called for all hypertransport devices + * that generate irqs. + * + * The irq number of the new irq or a negative error value is returned. + */ +int ht_create_irq(struct pci_dev *dev, int idx) +{ + struct ht_irq_cfg *cfg; + unsigned long flags; + u32 data; + int max_irq; + int pos; + int irq; + + pos = pci_find_capability(dev, PCI_CAP_ID_HT); + while (pos) { + u8 subtype; + pci_read_config_byte(dev, pos + 3, &subtype); + if (subtype == HT_CAPTYPE_IRQ) + break; + pos = pci_find_next_capability(dev, pos, PCI_CAP_ID_HT); + } + if (!pos) + return -EINVAL; + + /* Verify the idx I want to use is in range */ + spin_lock_irqsave(&ht_irq_lock, flags); + pci_write_config_byte(dev, pos + 2, 1); + pci_read_config_dword(dev, pos + 4, &data); + spin_unlock_irqrestore(&ht_irq_lock, flags); + + max_irq = (data >> 16) & 0xff; + if ( idx > max_irq) + return -EINVAL; + + cfg = kmalloc(sizeof(*cfg), GFP_KERNEL); + if (!cfg) + return -ENOMEM; + + cfg->dev = dev; + cfg->pos = pos; + cfg->idx = 0x10 + (idx * 2); + + irq = create_irq(); + if (irq < 0) { + kfree(cfg); + return -EBUSY; + } + set_irq_data(irq, cfg); + + if (arch_setup_ht_irq(irq, dev) < 0) { + ht_destroy_irq(irq); + return -EBUSY; + } + + return irq; +} + +/** + * ht_destroy_irq - destroy an irq created with ht_create_irq + * + * This reverses ht_create_irq removing the specified irq from + * existence. The irq should be free before this happens. + */ +void ht_destroy_irq(unsigned int irq) +{ + struct ht_irq_cfg *cfg; + + cfg = get_irq_data(irq); + set_irq_chip(irq, NULL); + set_irq_data(irq, NULL); + destroy_irq(irq); + + kfree(cfg); +} + +EXPORT_SYMBOL(ht_create_irq); +EXPORT_SYMBOL(ht_destroy_irq); diff --git a/include/asm-i386/hypertransport.h b/include/asm-i386/hypertransport.h new file mode 100644 index 0000000..c16c6ff --- /dev/null +++ b/include/asm-i386/hypertransport.h @@ -0,0 +1,42 @@ +#ifndef ASM_HYPERTRANSPORT_H +#define ASM_HYPERTRANSPORT_H + +/* + * Constants for x86 Hypertransport Interrupts. + */ + +#define HT_IRQ_LOW_BASE 0xf8000000 + +#define HT_IRQ_LOW_VECTOR_SHIFT 16 +#define HT_IRQ_LOW_VECTOR_MASK 0x00ff0000 +#define HT_IRQ_LOW_VECTOR(v) (((v) << HT_IRQ_LOW_VECTOR_SHIFT) & HT_IRQ_LOW_VECTOR_MASK) + +#define HT_IRQ_LOW_DEST_ID_SHIFT 8 +#define HT_IRQ_LOW_DEST_ID_MASK 0x0000ff00 +#define HT_IRQ_LOW_DEST_ID(v) (((v) << HT_IRQ_LOW_DEST_ID_SHIFT) & HT_IRQ_LOW_DEST_ID_MASK) + +#define HT_IRQ_LOW_DM_PHYSICAL 0x0000000 +#define HT_IRQ_LOW_DM_LOGICAL 0x0000040 + +#define HT_IRQ_LOW_RQEOI_EDGE 0x0000000 +#define HT_IRQ_LOW_RQEOI_LEVEL 0x0000020 + + +#define HT_IRQ_LOW_MT_FIXED 0x0000000 +#define HT_IRQ_LOW_MT_ARBITRATED 0x0000004 +#define HT_IRQ_LOW_MT_SMI 0x0000008 +#define HT_IRQ_LOW_MT_NMI 0x000000c +#define HT_IRQ_LOW_MT_INIT 0x0000010 +#define HT_IRQ_LOW_MT_STARTUP 0x0000014 +#define HT_IRQ_LOW_MT_EXTINT 0x0000018 +#define HT_IRQ_LOW_MT_LINT1 0x000008c +#define HT_IRQ_LOW_MT_LINT0 0x0000098 + +#define HT_IRQ_LOW_IRQ_MASKED 0x0000001 + + +#define HT_IRQ_HIGH_DEST_ID_SHIFT 0 +#define HT_IRQ_HIGH_DEST_ID_MASK 0x00ffffff +#define HT_IRQ_HIGH_DEST_ID(v) ((((v) >> 8) << HT_IRQ_HIGH_DEST_ID_SHIFT) & HT_IRQ_HIGH_DEST_ID_MASK) + +#endif /* ASM_HYPERTRANSPORT_H */ diff --git a/include/asm-x86_64/hypertransport.h b/include/asm-x86_64/hypertransport.h new file mode 100644 index 0000000..c16c6ff --- /dev/null +++ b/include/asm-x86_64/hypertransport.h @@ -0,0 +1,42 @@ +#ifndef ASM_HYPERTRANSPORT_H +#define ASM_HYPERTRANSPORT_H + +/* + * Constants for x86 Hypertransport Interrupts. + */ + +#define HT_IRQ_LOW_BASE 0xf8000000 + +#define HT_IRQ_LOW_VECTOR_SHIFT 16 +#define HT_IRQ_LOW_VECTOR_MASK 0x00ff0000 +#define HT_IRQ_LOW_VECTOR(v) (((v) << HT_IRQ_LOW_VECTOR_SHIFT) & HT_IRQ_LOW_VECTOR_MASK) + +#define HT_IRQ_LOW_DEST_ID_SHIFT 8 +#define HT_IRQ_LOW_DEST_ID_MASK 0x0000ff00 +#define HT_IRQ_LOW_DEST_ID(v) (((v) << HT_IRQ_LOW_DEST_ID_SHIFT) & HT_IRQ_LOW_DEST_ID_MASK) + +#define HT_IRQ_LOW_DM_PHYSICAL 0x0000000 +#define HT_IRQ_LOW_DM_LOGICAL 0x0000040 + +#define HT_IRQ_LOW_RQEOI_EDGE 0x0000000 +#define HT_IRQ_LOW_RQEOI_LEVEL 0x0000020 + + +#define HT_IRQ_LOW_MT_FIXED 0x0000000 +#define HT_IRQ_LOW_MT_ARBITRATED 0x0000004 +#define HT_IRQ_LOW_MT_SMI 0x0000008 +#define HT_IRQ_LOW_MT_NMI 0x000000c +#define HT_IRQ_LOW_MT_INIT 0x0000010 +#define HT_IRQ_LOW_MT_STARTUP 0x0000014 +#define HT_IRQ_LOW_MT_EXTINT 0x0000018 +#define HT_IRQ_LOW_MT_LINT1 0x000008c +#define HT_IRQ_LOW_MT_LINT0 0x0000098 + +#define HT_IRQ_LOW_IRQ_MASKED 0x0000001 + + +#define HT_IRQ_HIGH_DEST_ID_SHIFT 0 +#define HT_IRQ_HIGH_DEST_ID_MASK 0x00ffffff +#define HT_IRQ_HIGH_DEST_ID(v) ((((v) >> 8) << HT_IRQ_HIGH_DEST_ID_SHIFT) & HT_IRQ_HIGH_DEST_ID_MASK) + +#endif /* ASM_HYPERTRANSPORT_H */ diff --git a/include/linux/pci.h b/include/linux/pci.h index eb950d6..fd31b1c 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -672,6 +672,23 @@ extern int msi_register(struct msi_ops * #endif +#ifdef CONFIG_HT_IRQ +/* Helper functions.. */ +void write_ht_irq_low(unsigned int irq, u32 data); +void write_ht_irq_high(unsigned int irq, u32 data); +u32 read_ht_irq_low(unsigned int irq); +u32 read_ht_irq_high(unsigned int irq); +void mask_ht_irq(unsigned int irq); +void unmask_ht_irq(unsigned int irq); + +/* The functions a driver should call */ +int ht_create_irq(struct pci_dev *dev, int idx); +void ht_destroy_irq(unsigned int irq); + +/* The arch hook for getting things started */ +int arch_setup_ht_irq(unsigned int irq, struct pci_dev *dev); +#endif /* CONFIG_HT_IRQ */ + extern void pci_block_user_cfg_access(struct pci_dev *dev); extern void pci_unblock_user_cfg_access(struct pci_dev *dev); -- 1.4.1.gac83a ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] Initial generic hypertransport interrupt support. 2006-07-10 22:26 ` [PATCH 2/2] Initial generic hypertransport interrupt support Eric W. Biederman @ 2006-07-10 22:39 ` Benjamin Herrenschmidt 2006-07-11 3:51 ` Eric W. Biederman 2006-07-11 22:27 ` Andi Kleen 1 sibling, 1 reply; 22+ messages in thread From: Benjamin Herrenschmidt @ 2006-07-10 22:39 UTC (permalink / raw) To: Eric W. Biederman; +Cc: Andrew Morton, Dave Olson, linux-kernel On Mon, 2006-07-10 at 16:26 -0600, Eric W. Biederman wrote: > This patch implements two functions ht_create_irq and ht_destroy_irq > for use by drivers. Several other functions are implemented as helpers > for arch specific irq_chip handlers. > > The driver for the card I tested this on isn't yet ready to be merged. > However this code is and hypertransport irqs are in use in a few other > places in the kernel. Not that any of this will get merged before > 2.6.19 > > Because the ipath-ht400 is slightly out of spec this code will need > to be generalized to work there. > > I think all of the powerpc uses are for a plain interrupt controller > in a chipset so support for native hypertransport devices is a little > less interesting. At this point, the only PPCs with HT interrupts that I know are 970 based solutions using the Apple U3/U4 bridges (and their IBM counterparts). Thus all HT interrupts are routed to the MPIC as sources, so things like masking, affinity, etc... are all handled at the MPIC level, not at the HT level and they all originate from either an Apple home made HT APIC or standard HT APICs in PCI-X/E tunnels. We still need to poke around with the HT APICs for configuration and EOI on level interrupts (due to a bug in the Apple MPIC, the EOI doesn't get sent back to the HT APIC) but we have code locally in the MPIC driver to do it and I don't think it would fit well a generic layer. Things might be different in the future... but for now, I'm afraid I have no use of that HT layer. Cheers Ben. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] Initial generic hypertransport interrupt support. 2006-07-10 22:39 ` Benjamin Herrenschmidt @ 2006-07-11 3:51 ` Eric W. Biederman 2006-07-11 5:20 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 22+ messages in thread From: Eric W. Biederman @ 2006-07-11 3:51 UTC (permalink / raw) To: Benjamin Herrenschmidt; +Cc: Andrew Morton, Dave Olson, linux-kernel Benjamin Herrenschmidt <benh@kernel.crashing.org> writes: > At this point, the only PPCs with HT interrupts that I know are 970 > based solutions using the Apple U3/U4 bridges (and their IBM > counterparts). Thus all HT interrupts are routed to the MPIC as sources, > so things like masking, affinity, etc... are all handled at the MPIC > level, not at the HT level and they all originate from either an Apple > home made HT APIC or standard HT APICs in PCI-X/E tunnels. We still need > to poke around with the HT APICs for configuration and EOI on level > interrupts (due to a bug in the Apple MPIC, the EOI doesn't get sent > back to the HT APIC) but we have code locally in the MPIC driver to do > it and I don't think it would fit well a generic layer. > > Things might be different in the future... but for now, I'm afraid I > have no use of that HT layer. I didn't really expect you to have an immediate use, but the confirmation is nice. The interesting part is how I have factored out the arch specific details. I believe this is close to the direction you envisioned for msi. If you could look at the basic structure and confirm that the structure looks properly arch neutral that would be appreciated. As time permits I want to make the msi code look more the this hypertransport irq code. Eric ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] Initial generic hypertransport interrupt support. 2006-07-11 3:51 ` Eric W. Biederman @ 2006-07-11 5:20 ` Benjamin Herrenschmidt 2006-07-11 6:29 ` Eric W. Biederman 0 siblings, 1 reply; 22+ messages in thread From: Benjamin Herrenschmidt @ 2006-07-11 5:20 UTC (permalink / raw) To: Eric W. Biederman; +Cc: Andrew Morton, Dave Olson, linux-kernel > I didn't really expect you to have an immediate use, but the > confirmation is nice. The interesting part is how I have factored out > the arch specific details. I believe this is close to the direction > you envisioned for msi. If you could look at the basic structure > and confirm that the structure looks properly arch neutral that > would be appreciated. As time permits I want to make the msi code > look more the this hypertransport irq code. Ok, I'll try to have a second look in the plane :) (I'm flying off tomorrow). If you are interested in the new IRQ infrastructure I did for powerpc, it's now upstream (read comments in arch/powerpc/kernel/irq.c). My idea for MSIs etc... is that I've completley disconnected linux interrupt numbers and HW vector numbers for a given controller. Thus I can allocate linux virtual irq numbers (including linear ranges of them if we ever want to handle multiple MSIs (not MSI-X) etc.. However, it's up to a given irq controller to handle it's own allocation of vectors for those MSIs. Some controllers may have a fixed set of vectors reserved for MSIs, or in some cases, like pSeries LPAR, the firmware controls everything and just hands me a bunch of vectors for devices, etc... That also mean that the allocation of vectors for a given controller that can do MSI will be handled completely locally to the driver of that controller. For example, on the Quad G5, I will have the MPIC driver locally have a bitmap of what irq sources are used by LSIs and HT interrupts coming from IO-APICs (pretty much the same as LSIs as far as the driver is concerned) and will implement a local allocator to handle the allocation of the remaining sources by MSI capable devices. One problem I have at this point is with multiple MSIs. Our current interface, pci_enable_msi(), is pretty much documented as only ever enabling one MSI and I don't quite understand why we did that in the first place. It seems ok to me to define that it can enable 2^N MSIs as requested by the device with consecutive numbers starting at pdev->irq. Now if we don't want to change that, I would propose that we define a new interface, pci_enable_msi_multi(pdev, order, array) to which we can pass a requested count (order) which has to be <= to the device number of requested MSIs (so the driver can ask for less if it knows it doesn't need as much as the device asks for) and the actual interrupts infos are returned via an array identical to the one used for MSI-X. That way, even if hardware MSI vectors still have to be naturally aligned on the number requested and consecutive, the returned linux IRQ numbers don't have to. In addition, by holding the list of IRQs in the same data structure as MSI-X, this simplifies drivers who want to implement both modes. I don't have time to work much on the MSI aspect of things before OLS though and I'll be away after that until end of august with only intermittent internet access, so we'll talk about it again either at OLS itself if we find some time, or when I'm back. Cheers, Ben. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] Initial generic hypertransport interrupt support. 2006-07-11 5:20 ` Benjamin Herrenschmidt @ 2006-07-11 6:29 ` Eric W. Biederman 2006-07-11 7:29 ` Segher Boessenkool 0 siblings, 1 reply; 22+ messages in thread From: Eric W. Biederman @ 2006-07-11 6:29 UTC (permalink / raw) To: Benjamin Herrenschmidt; +Cc: Andrew Morton, Dave Olson, linux-kernel Benjamin Herrenschmidt <benh@kernel.crashing.org> writes: >> I didn't really expect you to have an immediate use, but the >> confirmation is nice. The interesting part is how I have factored out >> the arch specific details. I believe this is close to the direction >> you envisioned for msi. If you could look at the basic structure >> and confirm that the structure looks properly arch neutral that >> would be appreciated. As time permits I want to make the msi code >> look more the this hypertransport irq code. > > Ok, I'll try to have a second look in the plane :) (I'm flying off > tomorrow). If you are interested in the new IRQ infrastructure I did for > powerpc, it's now upstream (read comments in arch/powerpc/kernel/irq.c). Ok. Thanks. If I stay sufficient stuck in irq land I will. I keep hoping to find the end of this project :) This all started when I looked at some problem with irqs and say hey that is easy to fix you just do this.... > I don't have time to work much on the MSI aspect of things before OLS > though and I'll be away after that until end of august with only > intermittent internet access, so we'll talk about it again either at OLS > itself if we find some time, or when I'm back. Sure, it looks like I at least get to play care taker of this code for a while. I think the separation of linux irq numbers and hardware addresses seems fairly sane. Having something you can get a grip on is nice but that really doesn't work for things like msi's. Unless we move to a completely sparse architecture where the irq number is: ((((domain*256)+bus)*256)+devfn)*1024 As for supporting multiple irqs in plain MSI mode, I don't think we want to do that. The problem is that multiple interrupts in msi mode cannot be individually routed. I think we really want to encourage vendors who are building cards with multiple MSI irqs to use MSI-X. MSI-X has a lot fewer ugly special cases and all architectures can individually route the irqs. If there are interesting cards that support just MSI mode and really need more than one irq I would be happy to reconsider that decision but my impression was that plain MSI was basically not quite flexible enough to really be interesting, and supporting just one MSI irq was ok but any more would lead to all kinds of strange special cases. Eric ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] Initial generic hypertransport interrupt support. 2006-07-11 6:29 ` Eric W. Biederman @ 2006-07-11 7:29 ` Segher Boessenkool 2006-07-11 7:48 ` Eric W. Biederman 0 siblings, 1 reply; 22+ messages in thread From: Segher Boessenkool @ 2006-07-11 7:29 UTC (permalink / raw) To: Eric W. Biederman Cc: Benjamin Herrenschmidt, Andrew Morton, Dave Olson, linux-kernel > As for supporting multiple irqs in plain MSI mode, I don't think > we want to do that. The problem is that multiple interrupts > in msi mode cannot be individually routed. On some(/many/most) platforms that isn't a problem. Platforms for which it is can just refuse to allocate more than one MSI at once. > I think we really want > to encourage vendors who are building cards with multiple MSI irqs > to use MSI-X. MSI-X has a lot fewer ugly special cases and all > architectures can individually route the irqs. We still should support whatever hardware already exists, if possible. > If there are interesting cards that support just MSI mode and really > need more than one irq I would be happy to reconsider that decision > but my impression was that plain MSI was basically not quite flexible > enough to really be interesting, and supporting just one MSI irq was > ok but any more would lead to all kinds of strange special cases. Individual drivers can deal with those special cases if they are device- specific; and the platform can just refuse to do more than one MSI if something platform-specific would prevent correct operation. It would be nice to have the MSI and MSI-X interfaces have the same calling convention; in fact, they can probably be folded into one. Segher ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] Initial generic hypertransport interrupt support. 2006-07-11 7:29 ` Segher Boessenkool @ 2006-07-11 7:48 ` Eric W. Biederman 2006-07-11 9:15 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 22+ messages in thread From: Eric W. Biederman @ 2006-07-11 7:48 UTC (permalink / raw) To: Segher Boessenkool Cc: Benjamin Herrenschmidt, Andrew Morton, Dave Olson, linux-kernel Segher Boessenkool <segher@kernel.crashing.org> writes: >> As for supporting multiple irqs in plain MSI mode, I don't think >> we want to do that. The problem is that multiple interrupts >> in msi mode cannot be individually routed. > > On some(/many/most) platforms that isn't a problem. Platforms > for which it is can just refuse to allocate more than one MSI > at once. It is a problem on all platforms that currently implement MSI. >> I think we really want >> to encourage vendors who are building cards with multiple MSI irqs >> to use MSI-X. MSI-X has a lot fewer ugly special cases and all >> architectures can individually route the irqs. > > We still should support whatever hardware already exists, if > possible. Which hardware is this a problem for? MSI and MSI-X only guarantee the availability of 1 irq if I recall correctly. More are a bonus so cards should be able to fall back to a single irq mode. >> If there are interesting cards that support just MSI mode and really >> need more than one irq I would be happy to reconsider that decision >> but my impression was that plain MSI was basically not quite flexible >> enough to really be interesting, and supporting just one MSI irq was >> ok but any more would lead to all kinds of strange special cases. > > Individual drivers can deal with those special cases if they are device- > specific; and the platform can just refuse to do more than one MSI if > something platform-specific would prevent correct operation. > > It would be nice to have the MSI and MSI-X interfaces have the same > calling convention; in fact, they can probably be folded into one. Examples? details? patches? Part of the problem with plain MSI is that you can't mask irqs at the source, in a generic way. How do your ideas compare with my hypertransport irq implementation? Eric ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] Initial generic hypertransport interrupt support. 2006-07-11 7:48 ` Eric W. Biederman @ 2006-07-11 9:15 ` Benjamin Herrenschmidt 2006-07-11 19:56 ` Eric W. Biederman 0 siblings, 1 reply; 22+ messages in thread From: Benjamin Herrenschmidt @ 2006-07-11 9:15 UTC (permalink / raw) To: Eric W. Biederman Cc: Segher Boessenkool, Andrew Morton, Dave Olson, linux-kernel > Examples? details? patches? > > Part of the problem with plain MSI is that you can't mask irqs at the > source, in a generic way. This is an interesting point, because this shows precisely the different of approach between most HW PPC implementations vs what x86 does and what the current code does... Our MSIs are always routed as just additional sources to an existing IRQ controller that will itself then handle things like masking, affinity, etc... Thus, we don't need nor want any kind of generic MSI code setting up an irq_chip with enable/disable functions etc... Those should stay the ones from the system's main PIC, maybe a different version of them, but that's up to the system PIC to set that up. That's one of the reason why I think we need to work the MSI arch side API such that the MSI "controller" is the one to setup the irq_desc. The "generic" mask/unmask/etc... using the MSI(X) config space can be provided as optional helpers but it should be under arch, or more specifically MSI controller control to pick up how to setup the irq_desc and its associated irq_chip. Another one is the fact that we have multiple different MSI mecanisms on the same machines (like the Apple Quad G5 which have on one side an MSI "register" that devices write to and that triggers sources on the MPIC, and on the other side, HT interrupts which can be generated from MSIs by the broadcom HT<->PCIe bridge). Thus that msi_ops stucture I've seen around shall really be per PCI host bridge at the very least. One propsal I have, but I didn't have time to actually implement it, was to have the msi_ops pointer be a field in pci_bus that is inherited by default. That is, the arch can call pci_set_msi_ops() on a given bus and this will propagate to childs. Anyway, as I said, I didn't have time to code that part. So it's mostly food for thoughts. Ben. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] Initial generic hypertransport interrupt support. 2006-07-11 9:15 ` Benjamin Herrenschmidt @ 2006-07-11 19:56 ` Eric W. Biederman 2006-07-11 22:18 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 22+ messages in thread From: Eric W. Biederman @ 2006-07-11 19:56 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: Segher Boessenkool, Andrew Morton, Dave Olson, linux-kernel Benjamin Herrenschmidt <benh@kernel.crashing.org> writes: >> Examples? details? patches? >> >> Part of the problem with plain MSI is that you can't mask irqs at the >> source, in a generic way. > > This is an interesting point, because this shows precisely the different > of approach between most HW PPC implementations vs what x86 does and > what the current code does... > > Our MSIs are always routed as just additional sources to an existing IRQ > controller that will itself then handle things like masking, affinity, > etc... Interesting. In the short term I can see how that is a sane design. In the long term that appears to require more hardware and be a bottle neck on the number of irqs you can have so I don't expect the current ppc model to be the long term one. Does the current ppc model give hypervisors more control then they would get without going through an existing interrupt controller? > Thus, we don't need nor want any kind of generic MSI code setting up an > irq_chip with enable/disable functions etc... Those should stay the ones > from the system's main PIC, maybe a different version of them, but > that's up to the system PIC to set that up. For the hypertransport case I have coded it this way and I think there are good general cleanliness arguments for making this change to x86 as well. I think doing so actually winds up being less code. > That's one of the reason why I think we need to work the MSI arch side > API such that the MSI "controller" is the one to setup the irq_desc. The > "generic" mask/unmask/etc... using the MSI(X) config space can be > provided as optional helpers but it should be under arch, or more > specifically MSI controller control to pick up how to setup the irq_desc > and its associated irq_chip. > > Another one is the fact that we have multiple different MSI mecanisms on > the same machines (like the Apple Quad G5 which have on one side an MSI > "register" that devices write to and that triggers sources on the MPIC, > and on the other side, HT interrupts which can be generated from MSIs by > the broadcom HT<->PCIe bridge). Thus that msi_ops stucture I've seen > around shall really be per PCI host bridge at the very least. One > propsal I have, but I didn't have time to actually implement it, was to > have the msi_ops pointer be a field in pci_bus that is inherited by > default. That is, the arch can call pci_set_msi_ops() on a given bus and > this will propagate to childs. So to be very concise what I did on the HT side is that I have a function: arch_setup_ht_irq(struct pci_dev, int irq). That is used by the arch code to setup the irq_chip handler and the rest. There are helper functions: void write_ht_irq_low(unsigned int irq, u32 data); void write_ht_irq_high(unsigned int irq, u32 data); u32 read_ht_irq_low(unsigned int irq); u32 read_ht_irq_high(unsigned int irq); That read and write the individual hypertransport irq routing values per irq. I suspect this is the right general direction to move the msi code to. The combination allows an architecture to take bus specific details into an account, and have it's own irq handler methods. At the same time it is still the generic infrastructure controlling access to the chip registers. So I honest expect things like msi_ops to become an arch specific detail. At the very least I think it is too early to start generalizing that way if we really don't need to. Eric ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] Initial generic hypertransport interrupt support. 2006-07-11 19:56 ` Eric W. Biederman @ 2006-07-11 22:18 ` Benjamin Herrenschmidt 0 siblings, 0 replies; 22+ messages in thread From: Benjamin Herrenschmidt @ 2006-07-11 22:18 UTC (permalink / raw) To: Eric W. Biederman Cc: Segher Boessenkool, Andrew Morton, Dave Olson, linux-kernel On Tue, 2006-07-11 at 13:56 -0600, Eric W. Biederman wrote: > Interesting. In the short term I can see how that is a sane design. > In the long term that appears to require more hardware and be a > bottle neck on the number of irqs you can have so I don't expect > the current ppc model to be the long term one. > > Does the current ppc model give hypervisors more control then they > would get without going through an existing interrupt controller? Well, it's actually fairly scalable as the interrupt "controller" itself isn't a always single chip design. For example, on pSeries, there are interrupt "source" controllers scattered around the fabric, in the PCI bridges that connect to LSI IRQs and are recipients of MSIs as well. All of these then send the actual interrupt events as messages to one or more "presentation" controller(s) attached to the CPUs. The protocol between those is not visible to the operating system. One of the reasons the above is especially interesting is that MSI don't have to travel all the way through the fabric via the mormal bus protocols. Which means that they potentially lose ordering vs. normal transactions passed the first or second level of PCI bridges. But that "ordering" of MSIs has always been a bit dodgy anyway, since it's really fully enforceable if your MSI recipient is also your toplevel bridge, which simply can't always be the case. At least you get a guarantee at the device level. Regarding hypervisors, on pSeries, the whole thing is completely abstracted (including the interrupt presentation controller). Basically, the HV assigns MSIs or MSI-X to devices and tells you how much it did and what "vectors" at the toplevel controller were assigned to a given device. You have some calls to disable them or try to change the amount of assigned MSIs but that's it. HV is the one doing the config space stuff, enabling MSIs, choosing vector numbers, etc...We just use HV/RTAS calls to mask/unmask/set-affinity of a given interrupts wether it's an MSI or an LSI .../... > The combination allows an architecture to take bus specific details into > an account, and have it's own irq handler methods. At the same time > it is still the generic infrastructure controlling access to the > chip registers. > Well, on HV machinesm we can't even use that infrastructure since we aren't supposed to write to the chip registers even. That's why we have that very preliminary patch that allows powerpc, for now, to just hook at the toplevel of the implementation (re-implement pci_enable_msi() basically) so we can handle the HV case. But, we still have to find a way to have a given kernel handle both "more classical" MSI setups (where we have to program the chip config space & assign vectors ourselves) and the "hypervisor" scheme. On PowerMac, and other machines using the Apple U4 / IBM CPC945 bridge, things are a bit different. On those, the MSIs coming from the PCIe segment land into a register that simply turns stores it receives into triggers of a line of the MPIC interrupt controller in that bridge (that value stored is the line number). But that register, for some stupid reasons, doesn't work when accessed from the HyperTransport link (it has a wrong endian on it). So for MSIs on devices on HT<->PCIe tunnels, we need to use whatever facilities are provided by those tunnels to turn them into HT interrupts. Appart from that, HT interrupts are treated the same way: there is a "magic" register in the bridge that turns them into sources for the MPIC. So the later is more "classical", though it still doesn't fit the IO-APIC model where you need to mask/unmask at the controller level. But there is some more weird hardware in the ppc world. Some stuffs I'm not sure I can talk about much but who handles MSIs but storing the messages in a hardware FIFO and sending one upstream interrupt when there is something in there. That hardware can't mask/unmask a given MSI so manipulating the config space to do so (with MSI-X at least) might makes sense (though MSIs being naturally "edge" messages, one can always just drop them and use the retrigger mecanism of genirq to mask them, might be more efficient than masking at the device... depends how likely it is to actually take an MSI while "masked"). > So I honest expect things like msi_ops to become an arch specific detail. > At the very least I think it is too early to start generalizing that way > if we really don't need to. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] Initial generic hypertransport interrupt support. 2006-07-10 22:26 ` [PATCH 2/2] Initial generic hypertransport interrupt support Eric W. Biederman 2006-07-10 22:39 ` Benjamin Herrenschmidt @ 2006-07-11 22:27 ` Andi Kleen 2006-07-12 3:05 ` Eric W. Biederman 1 sibling, 1 reply; 22+ messages in thread From: Andi Kleen @ 2006-07-11 22:27 UTC (permalink / raw) To: Eric W. Biederman Cc: Dave Olson, linux-kernel, Benjamin Herrenschmidt, discuss ebiederm@xmission.com (Eric W. Biederman) writes: > This patch implements two functions ht_create_irq and ht_destroy_irq > for use by drivers. Several other functions are implemented as helpers > for arch specific irq_chip handlers. What do you want to use it for? Normally all HT configuration should be handled by the BIOS and not messed with by the kernel. -Andi ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] Initial generic hypertransport interrupt support. 2006-07-11 22:27 ` Andi Kleen @ 2006-07-12 3:05 ` Eric W. Biederman 2006-07-12 6:10 ` Dave Olson 0 siblings, 1 reply; 22+ messages in thread From: Eric W. Biederman @ 2006-07-12 3:05 UTC (permalink / raw) To: Andi Kleen; +Cc: Dave Olson, linux-kernel, Benjamin Herrenschmidt, discuss Andi Kleen <ak@suse.de> writes: > ebiederm@xmission.com (Eric W. Biederman) writes: > >> This patch implements two functions ht_create_irq and ht_destroy_irq >> for use by drivers. Several other functions are implemented as helpers >> for arch specific irq_chip handlers. > > What do you want to use it for? Normally all HT configuration should be handled > by the BIOS and not messed with by the kernel. I don't believe this is a typical HT configuration as you are thinking of it. There is a hypertransport capability that implements a rough equivalent of a per device ioapic. It is quite similar to MSI but with a different register level interface. Since native hypertransport devices do not implement a pin emulation mode as native pci express devices do so if you want an interrupt you must support the native hypertransport method. The pathscale ipath-ht400 driver already in the kernel tree uses these and uses so an ugly hack to make work that broke in the last round of the msi cleanups. I also know of a driver under development for a device that uses these as well. So I want to use this so I can get irqs from native hypertransport devices. Eric ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] Initial generic hypertransport interrupt support. 2006-07-12 3:05 ` Eric W. Biederman @ 2006-07-12 6:10 ` Dave Olson 2006-07-12 6:56 ` Eric W. Biederman 0 siblings, 1 reply; 22+ messages in thread From: Dave Olson @ 2006-07-12 6:10 UTC (permalink / raw) To: Eric W. Biederman Cc: Andi Kleen, linux-kernel, Benjamin Herrenschmidt, discuss On Tue, 11 Jul 2006, Eric W. Biederman wrote: | There is a hypertransport capability that implements a rough equivalent | of a per device ioapic. It is quite similar to MSI but with a different | register level interface. It's really just the same as MSI, and is set up and handled pretty much the same way. | Since native hypertransport devices do not implement a pin emulation mode | as native pci express devices do so if you want an interrupt you must support | the native hypertransport method. Right. | The pathscale ipath-ht400 driver already in the kernel tree uses these | and uses so an ugly hack to make work that broke in the last round of | the msi cleanups. I also know of a driver under development for a | device that uses these as well. Umm, it's not broken by any of the the MSI cleanups, at least through last week's 2.6.18. | So I want to use this so I can get irqs from native hypertransport | devices. This part I never really quite understood. Why do you want a separate interface than the existing request_irq() and pci_enable_msi()? Yes, there needs to be some HT-specific implementation behind it, but I don't see a reason for a whole new interface. Most of the rest of the HT stuff is setup via the pci_* functions, so why not the interrupts? Dave Olson olson@unixfolk.com http://www.unixfolk.com/dave ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] Initial generic hypertransport interrupt support. 2006-07-12 6:10 ` Dave Olson @ 2006-07-12 6:56 ` Eric W. Biederman 2006-07-13 3:56 ` Dave Olson 0 siblings, 1 reply; 22+ messages in thread From: Eric W. Biederman @ 2006-07-12 6:56 UTC (permalink / raw) To: Dave Olson; +Cc: Andi Kleen, linux-kernel, Benjamin Herrenschmidt, discuss Dave Olson <olson@unixfolk.com> writes: > On Tue, 11 Jul 2006, Eric W. Biederman wrote: > | There is a hypertransport capability that implements a rough equivalent > | of a per device ioapic. It is quite similar to MSI but with a different > | register level interface. > > It's really just the same as MSI, and is set up and handled pretty > much the same way. No it is not just the same. There is not global enable bit, only per irq enables. There is always a mask bit. The ht irq generates a 0 byte (with magic defines for the 32 byte enables) write while an msi generates a 4 byte write with no byte enables. With ht irqs the maximum number of irqs is 120 not the one with plain MSI or the 4k with MSI-X. But from the perspective of using them in a driver the concept really is the same. > | Since native hypertransport devices do not implement a pin emulation mode > | as native pci express devices do so if you want an interrupt you must support > | the native hypertransport method. > > Right. > > | The pathscale ipath-ht400 driver already in the kernel tree uses these > | and uses so an ugly hack to make work that broke in the last round of > | the msi cleanups. I also know of a driver under development for a > | device that uses these as well. > > Umm, it's not broken by any of the the MSI cleanups, at least > through last week's 2.6.18. The code that breaks it is only in -mm. It's scheduled for 2.6.19. All of the MSI magic in ioapic land on i386 and x86_64 is deleted. The code just needs to age a bit and let the few unexpected corner case crop up, and get sorted out. Hopefully fixing the ipath driver is one of the things we can sort out. > | So I want to use this so I can get irqs from native hypertransport > | devices. > > This part I never really quite understood. Why do you want a separate > interface than the existing request_irq(). request_irq is still needed. The question is how do you get the irq. > and pci_enable_msi()? The HT and msi semantics are moderately different, but I have implemented the equivalent of pci_enable/disable_msi. So the code is not a pci standard but just a ht standard I didn't use the pci prefix. > Yes, > there needs to be some HT-specific implementation behind it, but I > don't see a reason for a whole new interface. Most of the rest of > the HT stuff is setup via the pci_* functions, so why not the interrupts? The reason I did not reuse code from msi.c is that the code in msi.c is absolutely terrible. Also note that even the different flavors of msi have their own enable/disable routines. I expect I will make msi.c match htirq.c instead of the other way around. Of course I don't expect the interface exported to drivers to change. Eric ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] Initial generic hypertransport interrupt support. 2006-07-12 6:56 ` Eric W. Biederman @ 2006-07-13 3:56 ` Dave Olson 2006-07-13 15:13 ` Eric W. Biederman 0 siblings, 1 reply; 22+ messages in thread From: Dave Olson @ 2006-07-13 3:56 UTC (permalink / raw) To: Eric W. Biederman Cc: Andi Kleen, linux-kernel, Benjamin Herrenschmidt, discuss On Wed, 12 Jul 2006, Eric W. Biederman wrote: | Dave Olson <olson@unixfolk.com> writes: | | > On Tue, 11 Jul 2006, Eric W. Biederman wrote: | > | There is a hypertransport capability that implements a rough equivalent | > | of a per device ioapic. It is quite similar to MSI but with a different | > | register level interface. | > | > It's really just the same as MSI, and is set up and handled pretty | > much the same way. | | No it is not just the same. There is not global enable bit, only ... | But from the perspective of using them in a driver the concept really | is the same. I meant, from the perspective of a driver. Sorry for not being clear. | The code that breaks it is only in -mm. It's scheduled for 2.6.19. | All of the MSI magic in ioapic land on i386 and x86_64 is deleted. | The code just needs to age a bit and let the few unexpected | corner case crop up, and get sorted out. Well, if that set of patches is accepted into 2.6.19, it will likely break other people with HyperTransport chips and cards as well. I do know of other HTX-slot cards in development, and some of them, at least, do use interrupts. So I think it needs to age enough to not break existing drivers. Unfortunately, I still haven't had time to try your HT path with our ipath driver, because I'm in fire-fighting mode on some customer issues. You might be able to test it on some of your LS-1 systems in your lab, if you need early feedback. I'll try it, as soon as I can. | > This part I never really quite understood. Why do you want a separate | > interface than the existing request_irq(). | | request_irq is still needed. The question is how do you get the irq. | | > and pci_enable_msi()? | | The HT and msi semantics are moderately different, but I have | implemented the equivalent of pci_enable/disable_msi. So the | code is not a pci standard but just a ht standard I didn't use the | pci prefix. My point was that many other pci_* functions have to be called by any driver for HyperTransport-attached chips (that aren't chipset), so why break these out separately, rather than somehow fitting them into the existing routines, perhaps by looking at the bus-type the device is attached to? Unless we add a full set of ht_* routines (not something I'd like to see!) I don't see a reason to do it for just these routines, other than convenience for early testing of patch proposals, as opposed to final code going into the mainline. I'm not suggesting re-using existing (old) msi.c code. I'm suggesting making the new MSI code work for all 3 of PCI, PCIe, and HT, with the same exact exported routine. | Of course I don't expect the interface exported to drivers to change. But if you add new ht_* routines that they have to call, that's definitely a change, a new set of exported routines. I'm questioning whether that's really the right direction. If you had already intended to have them not exported in the final patches, that wasn't clear to me, and I apologize for misunderstanding. Dave Olson dave.olson@qlogic.com ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] Initial generic hypertransport interrupt support. 2006-07-13 3:56 ` Dave Olson @ 2006-07-13 15:13 ` Eric W. Biederman 2006-07-13 18:15 ` Dave Olson 0 siblings, 1 reply; 22+ messages in thread From: Eric W. Biederman @ 2006-07-13 15:13 UTC (permalink / raw) To: olson; +Cc: Andi Kleen, linux-kernel, Benjamin Herrenschmidt, discuss Dave Olson <olson@pathscale.com> writes: > On Wed, 12 Jul 2006, Eric W. Biederman wrote: > > | Dave Olson <olson@unixfolk.com> writes: > | > | > On Tue, 11 Jul 2006, Eric W. Biederman wrote: > | > | There is a hypertransport capability that implements a rough equivalent > | > | of a per device ioapic. It is quite similar to MSI but with a different > | > | register level interface. > | > > | > It's really just the same as MSI, and is set up and handled pretty > | > much the same way. > | > | No it is not just the same. There is not global enable bit, only > ... > | But from the perspective of using them in a driver the concept really > | is the same. > > I meant, from the perspective of a driver. Sorry for not being clear. Yes. Once you figure out which one you want the concept and get an irq using the in the driver is the same. The driver does need to be aware of the details so it knows which interrupt it is generating though. > | The code that breaks it is only in -mm. It's scheduled for 2.6.19. > | All of the MSI magic in ioapic land on i386 and x86_64 is deleted. > | The code just needs to age a bit and let the few unexpected > | corner case crop up, and get sorted out. > > Well, if that set of patches is accepted into 2.6.19, it will likely > break other people with HyperTransport chips and cards as well. I do > know of other HTX-slot cards in development, and some of them, at least, > do use interrupts. And I tested it on one of them. The problem is that there is no API in the kernel for properly handling hypertransport interrupts or even faking it well currently. There is no shame in breaking a bad unmaintainable hack, as I did. The responsible thing is to when you find one to fix up the code so that things work by design in a maintainable way, which I am attempting to do. > So I think it needs to age enough to not break existing drivers. All existing drivers that use HT interrupts are broken by design. Nor do out of tree kernel drivers count, because the authors are not working with the community. I can almost use the fact that the code breaks out of tree drivers as an argument for mainstream kernel inclusion. > Unfortunately, I still haven't had time to try your HT path with our > ipath driver, because I'm in fire-fighting mode on some customer issues. > You might be able to test it on some of your LS-1 systems in your lab, > if you need early feedback. I'll try it, as soon as I can. Sure. In that case can I please have a good description of what weird hacks your hardware designers have done. As I understand it I cannot write to the standard registers HT capability registers and have things work correctly. > > | > This part I never really quite understood. Why do you want a separate > | > interface than the existing request_irq(). > | > | request_irq is still needed. The question is how do you get the irq. > | > | > and pci_enable_msi()? > | > | The HT and msi semantics are moderately different, but I have > | implemented the equivalent of pci_enable/disable_msi. So the > | code is not a pci standard but just a ht standard I didn't use the > | pci prefix. > > My point was that many other pci_* functions have to be called by any > driver for HyperTransport-attached chips (that aren't chipset), so why > break these out separately, rather than somehow fitting them into the > existing routines, perhaps by looking at the bus-type the device is > attached to? Until you have time to look and think about this I'm not prepared to discuss this. All I did was add code to work with the standard ht irq capability. > Unless we add a full set of ht_* routines (not something I'd like to see!) > I don't see a reason to do it for just these routines, other than > convenience for early testing of patch proposals, as opposed to > final code going into the mainline. > > I'm not suggesting re-using existing (old) msi.c code. I'm suggesting > making the new MSI code work for all 3 of PCI, PCIe, and HT, with the > same exact exported routine. msi and msi-x are different, so to is ht. There is no real commonality there. > | Of course I don't expect the interface exported to drivers to change. > > But if you add new ht_* routines that they have to call, that's > definitely a change, a new set of exported routines. I'm questioning > whether that's really the right direction. If you had already intended > to have them not exported in the final patches, that wasn't clear to me, > and I apologize for misunderstanding. The functions I exported I intend to export. The complaint seems to be that you don't have anything that will work on earlier kernels. I have to agree you don't. Eric ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] Initial generic hypertransport interrupt support. 2006-07-13 15:13 ` Eric W. Biederman @ 2006-07-13 18:15 ` Dave Olson 2006-07-13 18:41 ` Eric W. Biederman 0 siblings, 1 reply; 22+ messages in thread From: Dave Olson @ 2006-07-13 18:15 UTC (permalink / raw) To: Eric W. Biederman Cc: Andi Kleen, linux-kernel, Benjamin Herrenschmidt, discuss On Thu, 13 Jul 2006, Eric W. Biederman wrote: | And I tested it on one of them. The problem is that there is no API in | the kernel for properly handling hypertransport interrupts or even faking | it well currently. There is no shame in breaking a bad unmaintainable | hack, as I did. The responsible thing is to when you find one to | fix up the code so that things work by design in a maintainable way, | which I am attempting to do. There's no problem providing a better replacement, just make sure that the existing drivers can use it. | All existing drivers that use HT interrupts are broken by design. That's a statement designed to provoke arguments, and I'll just leave it that we disagree. | Sure. In that case can I please have a good description of what | weird hacks your hardware designers have done. There's really nothing special at all about the interrupt setup, except in one very minor way. The value of the HT interrupt destination address needs to be copied from HT config space, to an internal chip register (which is, can, and should be, handled by the driver init code). | As I understand | it I cannot write to the standard registers HT capability registers | and have things work correctly. There is nothing unusual or special about that part at all. The only unusual item is that mentioned in the paragraph above. | The functions I exported I intend to export. The complaint seems to | be that you don't have anything that will work on earlier kernels. | I have to agree you don't. Huh? I didn't say anything that could possibly be read as applying to earlier kernels, and to be crystal clear, that's not my concern at all. Maybe somebody else can articulate what I'm trying to say, but I don't think I can say it in a clearer way. Dave Olson dave.olson@qlogic.com ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] Initial generic hypertransport interrupt support. 2006-07-13 18:15 ` Dave Olson @ 2006-07-13 18:41 ` Eric W. Biederman 2006-07-13 19:00 ` Dave Olson 0 siblings, 1 reply; 22+ messages in thread From: Eric W. Biederman @ 2006-07-13 18:41 UTC (permalink / raw) To: olson; +Cc: Andi Kleen, linux-kernel, Benjamin Herrenschmidt, discuss Dave Olson <olson@pathscale.com> writes: > On Thu, 13 Jul 2006, Eric W. Biederman wrote: > | And I tested it on one of them. The problem is that there is no API in > | the kernel for properly handling hypertransport interrupts or even faking > | it well currently. There is no shame in breaking a bad unmaintainable > | hack, as I did. The responsible thing is to when you find one to > | fix up the code so that things work by design in a maintainable way, > | which I am attempting to do. > > There's no problem providing a better replacement, just make sure > that the existing drivers can use it. I am working on that. What broke the drivers was the removal of the assumption that irq == vector. Which was a bad hack and never always true. > | All existing drivers that use HT interrupts are broken by design. > > That's a statement designed to provoke arguments, and I'll just leave > it that we disagree. I am just saying that if you don't have the infrastructure you need you cannot implement things correctly. It is no fault of the driver authors. Except possibly not standing up and asking for some infrastructure to do what needs to be done. > | Sure. In that case can I please have a good description of what > | weird hacks your hardware designers have done. > > There's really nothing special at all about the interrupt > setup, except in one very minor way. The value of the HT interrupt > destination address needs to be copied from HT config space, to > an internal chip register (which is, can, and should be, handled by > the driver init code). The kernel changes the value at runtime, based upon user input. I assume your mirror register needs to be updated after every change. Since the kernel changes the value at runtime, and since a different register needs to be written to, I can't quite use the generic code I have written as is. > | The functions I exported I intend to export. The complaint seems to > | be that you don't have anything that will work on earlier kernels. > | I have to agree you don't. > > Huh? I didn't say anything that could possibly be read as applying > to earlier kernels, and to be crystal clear, that's not my concern > at all. > > Maybe somebody else can articulate what I'm trying to say, but I don't > think I can say it in a clearer way. Ok. I guess I was reading something in that wasn't there. So see a similarity and say it looks like that can be generalized. I look and I don't see what having a magic DWIM enable irq interface would help with. The counter argument is essentially that if you have multiple options implemented but some of them are buggy Eric ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] Initial generic hypertransport interrupt support. 2006-07-13 18:41 ` Eric W. Biederman @ 2006-07-13 19:00 ` Dave Olson 2006-07-13 19:20 ` Eric W. Biederman 0 siblings, 1 reply; 22+ messages in thread From: Dave Olson @ 2006-07-13 19:00 UTC (permalink / raw) To: Eric W. Biederman Cc: Andi Kleen, linux-kernel, Benjamin Herrenschmidt, discuss On Thu, 13 Jul 2006, Eric W. Biederman wrote: | > There's really nothing special at all about the interrupt | > setup, except in one very minor way. The value of the HT interrupt | > destination address needs to be copied from HT config space, to | > an internal chip register (which is, can, and should be, handled by | > the driver init code). | | The kernel changes the value at runtime, based upon user input. | I assume your mirror register needs to be updated after every change. Yes. If the interrupt address changes, then we need a callback. | Since the kernel changes the value at runtime, and since a different | register needs to be written to, I can't quite use the generic code I | have written as is. I imagine at least some other drivers would like to know when their interrupt configuration changes, also, so an interface where a driver can register a callback handler seems like the right generic answer, or more simply, a way for a driver to say it doesn't want it's interrupt handler migrated (which we would like anyway, for performance reasons). Dave Olson dave.olson@qlogic.com ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] Initial generic hypertransport interrupt support. 2006-07-13 19:00 ` Dave Olson @ 2006-07-13 19:20 ` Eric W. Biederman 2006-07-13 19:34 ` Dave Olson 0 siblings, 1 reply; 22+ messages in thread From: Eric W. Biederman @ 2006-07-13 19:20 UTC (permalink / raw) To: olson; +Cc: Andi Kleen, linux-kernel, Benjamin Herrenschmidt, discuss Dave Olson <olson@pathscale.com> writes: > On Thu, 13 Jul 2006, Eric W. Biederman wrote: > | > There's really nothing special at all about the interrupt > | > setup, except in one very minor way. The value of the HT interrupt > | > destination address needs to be copied from HT config space, to > | > an internal chip register (which is, can, and should be, handled by > | > the driver init code). > | > | The kernel changes the value at runtime, based upon user input. > | I assume your mirror register needs to be updated after every change. > > Yes. If the interrupt address changes, then we need a callback. > > | Since the kernel changes the value at runtime, and since a different > | register needs to be written to, I can't quite use the generic code I > | have written as is. > > I imagine at least some other drivers would like to know when their interrupt > configuration changes, also, so an interface where a driver can register > a callback handler seems like the right generic answer, or more simply, > a way for a driver to say it doesn't want it's interrupt handler > migrated (which we would like anyway, for performance reasons). As I recall that is "killall irqbalanced" Eric ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] Initial generic hypertransport interrupt support. 2006-07-13 19:20 ` Eric W. Biederman @ 2006-07-13 19:34 ` Dave Olson 0 siblings, 0 replies; 22+ messages in thread From: Dave Olson @ 2006-07-13 19:34 UTC (permalink / raw) To: Eric W. Biederman Cc: Andi Kleen, linux-kernel, Benjamin Herrenschmidt, discuss On Thu, 13 Jul 2006, Eric W. Biederman wrote: | > I imagine at least some other drivers would like to know when their interrupt | > configuration changes, also, so an interface where a driver can register | > a callback handler seems like the right generic answer, or more simply, | > a way for a driver to say it doesn't want it's interrupt handler | > migrated (which we would like anyway, for performance reasons). | | As I recall that is "killall irqbalanced" That's the way for the user or admin to do it. It would be good for a driver to be able to say "I like my interrupt here, please don't move it". It can be advisory, rather than mandatory, but there are often cases where one driver is hurt by something that is generally beneficial, and disabling it everywhere is a big hammer. It works, but it's not ideal. Different issue than you are dealing with, but I thought it was worth bringing up, as long as we are talking about moving interrupts. Dave Olson dave.olson@qlogic.com ^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2006-07-13 19:34 UTC | newest] Thread overview: 22+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2006-07-10 22:14 [PATCH 1/2] Add Hypertransport capability defines Eric W. Biederman 2006-07-10 22:26 ` [PATCH 2/2] Initial generic hypertransport interrupt support Eric W. Biederman 2006-07-10 22:39 ` Benjamin Herrenschmidt 2006-07-11 3:51 ` Eric W. Biederman 2006-07-11 5:20 ` Benjamin Herrenschmidt 2006-07-11 6:29 ` Eric W. Biederman 2006-07-11 7:29 ` Segher Boessenkool 2006-07-11 7:48 ` Eric W. Biederman 2006-07-11 9:15 ` Benjamin Herrenschmidt 2006-07-11 19:56 ` Eric W. Biederman 2006-07-11 22:18 ` Benjamin Herrenschmidt 2006-07-11 22:27 ` Andi Kleen 2006-07-12 3:05 ` Eric W. Biederman 2006-07-12 6:10 ` Dave Olson 2006-07-12 6:56 ` Eric W. Biederman 2006-07-13 3:56 ` Dave Olson 2006-07-13 15:13 ` Eric W. Biederman 2006-07-13 18:15 ` Dave Olson 2006-07-13 18:41 ` Eric W. Biederman 2006-07-13 19:00 ` Dave Olson 2006-07-13 19:20 ` Eric W. Biederman 2006-07-13 19:34 ` Dave Olson
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox