* [PATCH 0 of 13] ipath - various fixes and cleanups
@ 2006-04-24 21:22 Bryan O'Sullivan
2006-04-24 21:22 ` [PATCH 1 of 13] ipath - fix race with exposing reset file Bryan O'Sullivan
` (12 more replies)
0 siblings, 13 replies; 37+ messages in thread
From: Bryan O'Sullivan @ 2006-04-24 21:22 UTC (permalink / raw)
To: rdreier; +Cc: openib-general, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 96 bytes --]
Hi, Roland -
Here is a set of bug fixes and cleanups for the ipath driver. Please apply.
<b
^ permalink raw reply [flat|nested] 37+ messages in thread
* [PATCH 1 of 13] ipath - fix race with exposing reset file
2006-04-24 21:22 [PATCH 0 of 13] ipath - various fixes and cleanups Bryan O'Sullivan
@ 2006-04-24 21:22 ` Bryan O'Sullivan
2006-04-24 21:22 ` [PATCH 2 of 13] ipath - set up 32-bit DMA mask if 64-bit setup fails Bryan O'Sullivan
` (11 subsequent siblings)
12 siblings, 0 replies; 37+ messages in thread
From: Bryan O'Sullivan @ 2006-04-24 21:22 UTC (permalink / raw)
To: rdreier; +Cc: openib-general, linux-kernel
We were accidentally exposing the "reset" sysfs file more than once
per device.
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
diff -r 8cc21848a9bb -r 61819d2519e0 drivers/infiniband/hw/ipath/ipath_diag.c
--- a/drivers/infiniband/hw/ipath/ipath_diag.c Wed Apr 19 15:24:35 2006 -0700
+++ b/drivers/infiniband/hw/ipath/ipath_diag.c Wed Apr 19 15:24:36 2006 -0700
@@ -277,12 +277,13 @@ static int ipath_diag_open(struct inode
bail:
spin_unlock_irqrestore(&ipath_devs_lock, flags);
- mutex_unlock(&ipath_mutex);
/* Only expose a way to reset the device if we
make it into diag mode. */
if (ret == 0)
ipath_expose_reset(&dd->pcidev->dev);
+
+ mutex_unlock(&ipath_mutex);
return ret;
}
diff -r 8cc21848a9bb -r 61819d2519e0 drivers/infiniband/hw/ipath/ipath_sysfs.c
--- a/drivers/infiniband/hw/ipath/ipath_sysfs.c Wed Apr 19 15:24:35 2006 -0700
+++ b/drivers/infiniband/hw/ipath/ipath_sysfs.c Wed Apr 19 15:24:36 2006 -0700
@@ -711,10 +711,22 @@ static struct attribute_group dev_attr_g
* enters diag mode. A device reset is quite likely to crash the
* machine entirely, so we don't want to normally make it
* available.
+ *
+ * Called with ipath_mutex held.
*/
int ipath_expose_reset(struct device *dev)
{
- return device_create_file(dev, &dev_attr_reset);
+ static int exposed;
+ int ret;
+
+ if (!exposed) {
+ ret = device_create_file(dev, &dev_attr_reset);
+ exposed = 1;
+ }
+ else
+ ret = 0;
+
+ return ret;
}
int ipath_driver_create_group(struct device_driver *drv)
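
For readers following along outside the kernel tree, the locking pattern in this patch can be sketched in user space. This is only an illustration under stated assumptions: `fake_create_file()`, `diag_mutex`, `expose_reset_locked()` and `diag_open()` are hypothetical stand-ins (for `device_create_file()`, `ipath_mutex`, `ipath_expose_reset()` and `ipath_diag_open()`), and a pthread mutex replaces the kernel mutex. Unlike the patch, the sketch only records the file as exposed when creation succeeds.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for device_create_file(); counts its invocations. */
static int create_calls;

static int fake_create_file(void)
{
	create_calls++;
	return 0;
}

static pthread_mutex_t diag_mutex = PTHREAD_MUTEX_INITIALIZER;
static bool exposed;	/* protected by diag_mutex */

/* Caller must hold diag_mutex, mirroring "Called with ipath_mutex held". */
static int expose_reset_locked(void)
{
	int ret = 0;

	if (!exposed) {
		ret = fake_create_file();
		if (ret == 0)
			exposed = true;
	}
	return ret;
}

/* The check-and-create step runs before the mutex is dropped. */
static int diag_open(void)
{
	int ret;

	pthread_mutex_lock(&diag_mutex);
	ret = expose_reset_locked();
	pthread_mutex_unlock(&diag_mutex);
	return ret;
}
```

Calling `diag_open()` repeatedly creates the file once; because the flag is only read and written under the mutex, two concurrent openers cannot both see it unset.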
* [PATCH 2 of 13] ipath - set up 32-bit DMA mask if 64-bit setup fails
2006-04-24 21:22 [PATCH 0 of 13] ipath - various fixes and cleanups Bryan O'Sullivan
2006-04-24 21:22 ` [PATCH 1 of 13] ipath - fix race with exposing reset file Bryan O'Sullivan
@ 2006-04-24 21:22 ` Bryan O'Sullivan
2006-05-01 18:47 ` Roland Dreier
2006-04-24 21:22 ` [PATCH 3 of 13] ipath - iterate over correct number of ports during reset Bryan O'Sullivan
` (10 subsequent siblings)
12 siblings, 1 reply; 37+ messages in thread
From: Bryan O'Sullivan @ 2006-04-24 21:22 UTC (permalink / raw)
To: rdreier; +Cc: openib-general, linux-kernel
Some platforms do not set up 64-bit DMA maps when 2GB or less of memory
is installed, so we have to fall back to trying a 32-bit setup.
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
diff -r 61819d2519e0 -r 1906950392f7 drivers/infiniband/hw/ipath/ipath_driver.c
--- a/drivers/infiniband/hw/ipath/ipath_driver.c Wed Apr 19 15:24:36 2006 -0700
+++ b/drivers/infiniband/hw/ipath/ipath_driver.c Wed Apr 19 15:24:36 2006 -0700
@@ -418,9 +418,19 @@ static int __devinit ipath_init_one(stru
ret = pci_set_dma_mask(pdev, DMA_64BIT_MASK);
if (ret) {
- dev_info(&pdev->dev, "pci_set_dma_mask unit %u "
- "fails: %d\n", dd->ipath_unit, ret);
- goto bail_regions;
+ /*
+ * If the 64-bit setup fails, try 32-bit. Some systems
+ * do not set up 64-bit maps when 2GB or less of
+ * memory is installed.
+ */
+ ret = pci_set_dma_mask(pdev, DMA_32BIT_MASK);
+ if (ret) {
+ dev_info(&pdev->dev, "pci_set_dma_mask unit %u "
+ "fails: %d\n", dd->ipath_unit, ret);
+ goto bail_regions;
+ }
+ else
+ ipath_dbg("No 64bit DMA mask, used 32 bit mask\n");
}
pci_set_master(pdev);
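
The fallback logic in this patch can be sketched on its own as follows. This is a rough user-space illustration, not driver code: `set_mask_fn`, `setup_dma_mask()`, the `MASK_*` constants and the two sample callbacks are hypothetical names, with the callback standing in for `pci_set_dma_mask()`.

```c
#include <stdint.h>

/* Hypothetical callback type standing in for pci_set_dma_mask(). */
typedef int (*set_mask_fn)(uint64_t mask);

#define MASK_64BIT UINT64_MAX
#define MASK_32BIT UINT64_C(0xffffffff)

/*
 * Try the 64-bit mask first and fall back to 32 bits.
 * Returns 0 and stores the accepted mask in *chosen, or returns the
 * error from the 32-bit attempt if both attempts fail.
 */
static int setup_dma_mask(set_mask_fn set_mask, uint64_t *chosen)
{
	int ret = set_mask(MASK_64BIT);

	if (ret == 0) {
		*chosen = MASK_64BIT;
		return 0;
	}

	ret = set_mask(MASK_32BIT);
	if (ret == 0)
		*chosen = MASK_32BIT;
	return ret;
}

/* Sample platform that only accepts a 32-bit mask (-5 is an arbitrary error). */
static int accept_only_32bit(uint64_t mask)
{
	return mask == MASK_32BIT ? 0 : -5;
}

/* Sample platform that rejects every mask. */
static int reject_all(uint64_t mask)
{
	(void)mask;
	return -5;
}
```

As in the patch, only a failure of the second attempt is treated as fatal; the first failure is expected on some machines.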
* [PATCH 3 of 13] ipath - iterate over correct number of ports during reset
2006-04-24 21:22 [PATCH 0 of 13] ipath - various fixes and cleanups Bryan O'Sullivan
2006-04-24 21:22 ` [PATCH 1 of 13] ipath - fix race with exposing reset file Bryan O'Sullivan
2006-04-24 21:22 ` [PATCH 2 of 13] ipath - set up 32-bit DMA mask if 64-bit setup fails Bryan O'Sullivan
@ 2006-04-24 21:22 ` Bryan O'Sullivan
2006-04-24 21:23 ` [PATCH 4 of 13] ipath - change handling of PIO buffers Bryan O'Sullivan
` (9 subsequent siblings)
12 siblings, 0 replies; 37+ messages in thread
From: Bryan O'Sullivan @ 2006-04-24 21:22 UTC (permalink / raw)
To: rdreier; +Cc: openib-general, linux-kernel
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
diff -r 1906950392f7 -r 49f2286e0bdc drivers/infiniband/hw/ipath/ipath_driver.c
--- a/drivers/infiniband/hw/ipath/ipath_driver.c Wed Apr 19 15:24:36 2006 -0700
+++ b/drivers/infiniband/hw/ipath/ipath_driver.c Wed Apr 19 15:24:36 2006 -0700
@@ -1959,7 +1959,7 @@ int ipath_reset_device(int unit)
}
if (dd->ipath_pd)
- for (i = 1; i < dd->ipath_portcnt; i++) {
+ for (i = 1; i < dd->ipath_cfgports; i++) {
if (dd->ipath_pd[i] && dd->ipath_pd[i]->port_cnt) {
ipath_dbg("unit %u port %d is in use "
"(PID %u cmd %s), can't reset\n",
* [PATCH 4 of 13] ipath - change handling of PIO buffers
2006-04-24 21:22 [PATCH 0 of 13] ipath - various fixes and cleanups Bryan O'Sullivan
` (2 preceding siblings ...)
2006-04-24 21:22 ` [PATCH 3 of 13] ipath - iterate over correct number of ports during reset Bryan O'Sullivan
@ 2006-04-24 21:23 ` Bryan O'Sullivan
2006-04-25 9:32 ` Segher Boessenkool
2006-04-24 21:23 ` [PATCH 5 of 13] ipath - use proper address translation routine Bryan O'Sullivan
` (8 subsequent siblings)
12 siblings, 1 reply; 37+ messages in thread
From: Bryan O'Sullivan @ 2006-04-24 21:23 UTC (permalink / raw)
To: rdreier; +Cc: openib-general, linux-kernel
Different ipath hardware types have different numbers of buffers
available, so we decide on the counts ourselves unless we are specifically
overridden with a module parameter.
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
diff -r 49f2286e0bdc -r 8e724d49e74b drivers/infiniband/hw/ipath/ipath_init_chip.c
--- a/drivers/infiniband/hw/ipath/ipath_init_chip.c Wed Apr 19 15:24:36 2006 -0700
+++ b/drivers/infiniband/hw/ipath/ipath_init_chip.c Wed Apr 19 15:24:36 2006 -0700
@@ -53,13 +53,19 @@ MODULE_PARM_DESC(cfgports, "Set max numb
/*
* Number of buffers reserved for driver (layered drivers and SMA
- * send). Reserved at end of buffer list.
+ * send). Reserved at end of buffer list. Initialized based on
+ * number of PIO buffers if not set via module interface.
+ * The problem with this is that it's global, but we'll use different
+ * numbers for different chip types. So the default value is not
+ * very useful. I've redefined it for the 1.3 release so that it's
+ * zero unless set by the user to something else, in which case we
+ * try to respect it.
*/
-static ushort ipath_kpiobufs = 32;
+static ushort ipath_kpiobufs;
static int ipath_set_kpiobufs(const char *val, struct kernel_param *kp);
-module_param_call(kpiobufs, ipath_set_kpiobufs, param_get_uint,
+module_param_call(kpiobufs, ipath_set_kpiobufs, param_get_ushort,
&ipath_kpiobufs, S_IWUSR | S_IRUGO);
MODULE_PARM_DESC(kpiobufs, "Set number of PIO buffers for driver");
@@ -531,8 +537,11 @@ static int init_housekeeping(struct ipat
* Don't clear ipath_flags as 8bit mode was set before
* entering this func. However, we do set the linkstate to
* unknown, so we can watch for a transition.
- */
- dd->ipath_flags |= IPATH_LINKUNK;
+ * PRESENT is set because we want register reads to work,
+ * and the kernel infrastructure saw it in config space;
+ * We clear it if we have failures.
+ */
+ dd->ipath_flags |= IPATH_LINKUNK | IPATH_PRESENT;
dd->ipath_flags &= ~(IPATH_LINKACTIVE | IPATH_LINKARMED |
IPATH_LINKDOWN | IPATH_LINKINIT);
@@ -560,6 +569,7 @@ static int init_housekeeping(struct ipat
|| (dd->ipath_uregbase & 0xffffffff) == 0xffffffff) {
ipath_dev_err(dd, "Register read failures from chip, "
"giving up initialization\n");
+ dd->ipath_flags &= ~IPATH_PRESENT;
ret = -ENODEV;
goto done;
}
@@ -682,16 +692,14 @@ int ipath_init_chip(struct ipath_devdata
*/
dd->ipath_pioavregs = ALIGN(val, sizeof(u64) * BITS_PER_BYTE / 2)
/ (sizeof(u64) * BITS_PER_BYTE / 2);
- if (!ipath_kpiobufs) /* have to have at least 1, for SMA */
- kpiobufs = ipath_kpiobufs = 1;
- else if ((dd->ipath_piobcnt2k + dd->ipath_piobcnt4k) <
- (dd->ipath_cfgports * IPATH_MIN_USER_PORT_BUFCNT)) {
- dev_info(&dd->pcidev->dev, "Too few PIO buffers (%u) "
- "for %u ports to have %u each!\n",
- dd->ipath_piobcnt2k + dd->ipath_piobcnt4k,
- dd->ipath_cfgports, IPATH_MIN_USER_PORT_BUFCNT);
- kpiobufs = 1; /* reserve just the minimum for SMA/ether */
- } else
+ if (ipath_kpiobufs == 0) {
+ /* not set by user, or set explicitly to default */
+ if ((dd->ipath_piobcnt2k + dd->ipath_piobcnt4k) > 128)
+ kpiobufs = 32;
+ else
+ kpiobufs = 16;
+ }
+ else
kpiobufs = ipath_kpiobufs;
if (kpiobufs >
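
The new default selection can be read as a small pure function. The sketch below is only an illustration; `pick_kpiobufs()` and its parameter names are hypothetical, with `total_piobufs` standing for `ipath_piobcnt2k + ipath_piobcnt4k`. It covers only the choice shown in the hunk above, not the check that follows it.

```c
/*
 * user_kpiobufs: value of the module parameter, 0 meaning "not set".
 * total_piobufs: number of 2k plus 4k PIO buffers on the chip.
 */
static unsigned int pick_kpiobufs(unsigned int user_kpiobufs,
				  unsigned int total_piobufs)
{
	if (user_kpiobufs != 0)
		return user_kpiobufs;	/* respect an explicit setting */

	/* Chips with more buffers can afford a larger kernel reserve. */
	return total_piobufs > 128 ? 32 : 16;
}
```

For example, `pick_kpiobufs(0, 256)` yields 32, `pick_kpiobufs(0, 128)` yields 16, and any nonzero user value is passed through unchanged.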
* [PATCH 5 of 13] ipath - use proper address translation routine
2006-04-24 21:22 [PATCH 0 of 13] ipath - various fixes and cleanups Bryan O'Sullivan
` (3 preceding siblings ...)
2006-04-24 21:23 ` [PATCH 4 of 13] ipath - change handling of PIO buffers Bryan O'Sullivan
@ 2006-04-24 21:23 ` Bryan O'Sullivan
2006-05-01 18:50 ` Roland Dreier
2006-04-24 21:23 ` [PATCH 6 of 13] ipath - fix verbs registration Bryan O'Sullivan
` (7 subsequent siblings)
12 siblings, 1 reply; 37+ messages in thread
From: Bryan O'Sullivan @ 2006-04-24 21:23 UTC (permalink / raw)
To: rdreier; +Cc: openib-general, linux-kernel
Move away from an obsolete, unportable routine for translating physical
addresses.
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
diff -r 8e724d49e74b -r 1ab168913f0f drivers/infiniband/hw/ipath/ipath_keys.c
--- a/drivers/infiniband/hw/ipath/ipath_keys.c Wed Apr 19 15:24:36 2006 -0700
+++ b/drivers/infiniband/hw/ipath/ipath_keys.c Wed Apr 19 15:24:36 2006 -0700
@@ -125,12 +125,12 @@ int ipath_lkey_ok(struct ipath_lkey_tabl
/*
* We use LKEY == zero to mean a physical kmalloc() address.
- * This is a bit of a hack since we rely on dma_map_single()
- * being reversible by calling bus_to_virt().
+ * This is a bit of a hack since we rely on being able to
+ * reverse the mapping by calling phys_to_virt().
*/
if (sge->lkey == 0) {
isge->mr = NULL;
- isge->vaddr = bus_to_virt(sge->addr);
+ isge->vaddr = phys_to_virt(sge->addr);
isge->length = sge->length;
isge->sge_length = sge->length;
ret = 1;
* [PATCH 6 of 13] ipath - fix verbs registration
2006-04-24 21:22 [PATCH 0 of 13] ipath - various fixes and cleanups Bryan O'Sullivan
` (4 preceding siblings ...)
2006-04-24 21:23 ` [PATCH 5 of 13] ipath - use proper address translation routine Bryan O'Sullivan
@ 2006-04-24 21:23 ` Bryan O'Sullivan
2006-04-24 21:23 ` [PATCH 7 of 13] ipath - prevent hardware from being accessed during reset Bryan O'Sullivan
` (6 subsequent siblings)
12 siblings, 0 replies; 37+ messages in thread
From: Bryan O'Sullivan @ 2006-04-24 21:23 UTC (permalink / raw)
To: rdreier; +Cc: openib-general, linux-kernel
Remember when the verbs layer unregisters from the lower-level code.
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
diff -r 1ab168913f0f -r 3ff1e5ae1c60 drivers/infiniband/hw/ipath/ipath_layer.c
--- a/drivers/infiniband/hw/ipath/ipath_layer.c Wed Apr 19 15:24:36 2006 -0700
+++ b/drivers/infiniband/hw/ipath/ipath_layer.c Wed Apr 19 15:24:36 2006 -0700
@@ -46,13 +46,15 @@
/* Acquire before ipath_devs_lock. */
static DEFINE_MUTEX(ipath_layer_mutex);
+static int ipath_verbs_registered;
+
u16 ipath_layer_rcv_opcode;
+
static int (*layer_intr)(void *, u32);
static int (*layer_rcv)(void *, void *, struct sk_buff *);
static int (*layer_rcv_lid)(void *, void *);
static int (*verbs_piobufavail)(void *);
static void (*verbs_rcv)(void *, void *, void *, u32);
-static int ipath_verbs_registered;
static void *(*layer_add_one)(int, struct ipath_devdata *);
static void (*layer_remove_one)(void *);
@@ -585,6 +587,8 @@ void ipath_verbs_unregister(void)
verbs_piobufavail = NULL;
verbs_rcv = NULL;
verbs_timer_cb = NULL;
+
+ ipath_verbs_registered = 0;
mutex_unlock(&ipath_layer_mutex);
}
* [PATCH 7 of 13] ipath - prevent hardware from being accessed during reset
2006-04-24 21:22 [PATCH 0 of 13] ipath - various fixes and cleanups Bryan O'Sullivan
` (5 preceding siblings ...)
2006-04-24 21:23 ` [PATCH 6 of 13] ipath - fix verbs registration Bryan O'Sullivan
@ 2006-04-24 21:23 ` Bryan O'Sullivan
2006-04-24 21:23 ` [PATCH 8 of 13] ipath - fix a number of RC protocol bugs Bryan O'Sullivan
` (5 subsequent siblings)
12 siblings, 0 replies; 37+ messages in thread
From: Bryan O'Sullivan @ 2006-04-24 21:23 UTC (permalink / raw)
To: rdreier; +Cc: openib-general, linux-kernel
The reset code now turns off the PRESENT flag during a reset, so that
other code won't attempt to access a device that's in mid-reset.
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
diff -r 3ff1e5ae1c60 -r ee2f95e99c27 drivers/infiniband/hw/ipath/ipath_intr.c
--- a/drivers/infiniband/hw/ipath/ipath_intr.c Wed Apr 19 15:24:36 2006 -0700
+++ b/drivers/infiniband/hw/ipath/ipath_intr.c Wed Apr 19 15:24:36 2006 -0700
@@ -719,11 +719,24 @@ irqreturn_t ipath_intr(int irq, void *da
irqreturn_t ipath_intr(int irq, void *data, struct pt_regs *regs)
{
struct ipath_devdata *dd = data;
- u32 istat = ipath_read_kreg32(dd, dd->ipath_kregs->kr_intstatus);
+ u32 istat;
ipath_err_t estat = 0;
static unsigned unexpected = 0;
irqreturn_t ret;
+ if (!(dd->ipath_flags & IPATH_PRESENT)) {
+ /* this is mostly so we don't try to touch the chip while
+ * it is being reset */
+ /*
+ * This return value is perhaps odd, but we do not want the
+ * interrupt core code to remove our interrupt handler
+ * because we don't appear to be handling an interrupt
+ * during a chip reset.
+ */
+ return IRQ_HANDLED;
+ }
+
+ istat = ipath_read_kreg32(dd, dd->ipath_kregs->kr_intstatus);
if (unlikely(!istat)) {
ipath_stats.sps_nullintr++;
ret = IRQ_NONE; /* not our interrupt, or already handled */
diff -r 3ff1e5ae1c60 -r ee2f95e99c27 drivers/infiniband/hw/ipath/ipath_kernel.h
--- a/drivers/infiniband/hw/ipath/ipath_kernel.h Wed Apr 19 15:24:36 2006 -0700
+++ b/drivers/infiniband/hw/ipath/ipath_kernel.h Wed Apr 19 15:24:36 2006 -0700
@@ -731,7 +731,7 @@ static inline u32 ipath_read_ureg32(cons
static inline u32 ipath_read_ureg32(const struct ipath_devdata *dd,
ipath_ureg regno, int port)
{
- if (!dd->ipath_kregbase)
+ if (!dd->ipath_kregbase || !(dd->ipath_flags & IPATH_PRESENT))
return 0;
return readl(regno + (u64 __iomem *)
@@ -762,7 +762,7 @@ static inline u32 ipath_read_kreg32(cons
static inline u32 ipath_read_kreg32(const struct ipath_devdata *dd,
ipath_kreg regno)
{
- if (!dd->ipath_kregbase)
+ if (!dd->ipath_kregbase || !(dd->ipath_flags & IPATH_PRESENT))
return -1;
return readl((u32 __iomem *) & dd->ipath_kregbase[regno]);
}
@@ -770,7 +770,7 @@ static inline u64 ipath_read_kreg64(cons
static inline u64 ipath_read_kreg64(const struct ipath_devdata *dd,
ipath_kreg regno)
{
- if (!dd->ipath_kregbase)
+ if (!dd->ipath_kregbase || !(dd->ipath_flags & IPATH_PRESENT))
return -1;
return readq(&dd->ipath_kregbase[regno]);
@@ -786,7 +786,7 @@ static inline u64 ipath_read_creg(const
static inline u64 ipath_read_creg(const struct ipath_devdata *dd,
ipath_sreg regno)
{
- if (!dd->ipath_kregbase)
+ if (!dd->ipath_kregbase || !(dd->ipath_flags & IPATH_PRESENT))
return 0;
return readq(regno + (u64 __iomem *)
@@ -797,7 +797,7 @@ static inline u32 ipath_read_creg32(cons
static inline u32 ipath_read_creg32(const struct ipath_devdata *dd,
ipath_sreg regno)
{
- if (!dd->ipath_kregbase)
+ if (!dd->ipath_kregbase || !(dd->ipath_flags & IPATH_PRESENT))
return 0;
return readl(regno + (u64 __iomem *)
(dd->ipath_cregbase +
diff -r 3ff1e5ae1c60 -r ee2f95e99c27 drivers/infiniband/hw/ipath/ipath_pe800.c
--- a/drivers/infiniband/hw/ipath/ipath_pe800.c Wed Apr 19 15:24:36 2006 -0700
+++ b/drivers/infiniband/hw/ipath/ipath_pe800.c Wed Apr 19 15:24:36 2006 -0700
@@ -972,6 +972,8 @@ static int ipath_setup_pe_reset(struct i
/* Use ERROR so it shows up in logs, etc. */
ipath_dev_err(dd, "Resetting PE-800 unit %u\n",
dd->ipath_unit);
+ /* keep chip from being accessed in a few places */
+ dd->ipath_flags &= ~(IPATH_INITTED|IPATH_PRESENT);
val = dd->ipath_control | INFINIPATH_C_RESET;
ipath_write_kreg(dd, dd->ipath_kregs->kr_control, val);
mb();
@@ -997,6 +999,8 @@ static int ipath_setup_pe_reset(struct i
if ((r = pci_enable_device(dd->pcidev)))
ipath_dev_err(dd, "pci_enable_device failed after "
"reset: %d\n", r);
+ /* whether it worked or not, mark as present, again */
+ dd->ipath_flags |= IPATH_PRESENT;
val = ipath_read_kreg64(dd, dd->ipath_kregs->kr_revision);
if (val == dd->ipath_revision) {
ipath_cdbg(VERBOSE, "Got matching revision "
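
The guard added to the register accessors can be illustrated with a user-space sketch. Everything here is a hypothetical stand-in: `struct fake_dev`, `DEV_PRESENT` and `read_reg32()` are invented for the example, and a plain array replaces the memory-mapped register window, so there is no `readl()`.

```c
#include <stdint.h>

#define DEV_PRESENT 0x1u	/* hypothetical bit, stands in for IPATH_PRESENT */

struct fake_dev {
	unsigned int flags;
	const uint32_t *regbase;	/* NULL until the registers are mapped */
};

/*
 * Return all ones instead of touching the "hardware" when the device
 * is unmapped or marked not present, e.g. while a reset is in flight.
 */
static uint32_t read_reg32(const struct fake_dev *dd, unsigned int regno)
{
	if (!dd->regbase || !(dd->flags & DEV_PRESENT))
		return UINT32_MAX;
	return dd->regbase[regno];
}
```

A reset path would clear `DEV_PRESENT` before poking the chip and set it again afterwards, so readers in between see the all-ones value rather than accessing a device in mid-reset.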
* [PATCH 8 of 13] ipath - fix a number of RC protocol bugs
2006-04-24 21:22 [PATCH 0 of 13] ipath - various fixes and cleanups Bryan O'Sullivan
` (6 preceding siblings ...)
2006-04-24 21:23 ` [PATCH 7 of 13] ipath - prevent hardware from being accessed during reset Bryan O'Sullivan
@ 2006-04-24 21:23 ` Bryan O'Sullivan
2006-04-25 7:56 ` Andrew Morton
2006-04-24 21:23 ` [PATCH 9 of 13] ipath - simplify RC send posting Bryan O'Sullivan
` (4 subsequent siblings)
12 siblings, 1 reply; 37+ messages in thread
From: Bryan O'Sullivan @ 2006-04-24 21:23 UTC (permalink / raw)
To: rdreier; +Cc: openib-general, linux-kernel
This change fixes a number of RC protocol bugs:
1. ipath_init_restart() could be called when the QP is already on the
timeout list, thus triggering a bad BUG_ON.
2. If an RDMA read was received on a QP without remote read access,
the s_lock spin lock was reentered.
3. If a sequence NAK was received for a PSN in the middle of a
pending operation, the code that computes which operation to restart
had a bug, so the wrong opcode/PSN was resent. This caused
the RC connection to go into the error state.
4. If an RC connection was configured for shared receive queues (SRQ),
the limit sequence number was not being handled correctly when
RDMA reads, writes, or atomic operations were performed, thus causing
the RC connection to hang.
Signed-off-by: Ralph Campbell <ralphc@pathscale.com>
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
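
As a small illustration of item 4, the guarded increment of the limit sequence number used in the diff below can be sketched as a helper. This assumes, without confirmation from this message, that `(u32) -1` acts as a sentinel that must not be incremented (otherwise it would wrap to zero); `LSN_NO_LIMIT` and `bump_lsn()` are hypothetical names.

```c
#include <stdint.h>

/* Assumed meaning of the (u32) -1 value tested in the patch. */
#define LSN_NO_LIMIT UINT32_MAX

/* Advance the limit sequence number unless it holds the sentinel. */
static uint32_t bump_lsn(uint32_t lsn)
{
	return lsn == LSN_NO_LIMIT ? lsn : lsn + 1;
}
```

Without the guard, incrementing the sentinel would turn "no limit" into a limit of zero, which is consistent with the hang described above.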
diff -r ee2f95e99c27 -r fafcc38877ad drivers/infiniband/hw/ipath/ipath_rc.c
--- a/drivers/infiniband/hw/ipath/ipath_rc.c Wed Apr 19 15:24:36 2006 -0700
+++ b/drivers/infiniband/hw/ipath/ipath_rc.c Mon Apr 24 14:21:04 2006 -0700
@@ -57,9 +57,8 @@ static void ipath_init_restart(struct ip
qp->s_len = wqe->length - len;
dev = to_idev(qp->ibqp.device);
spin_lock(&dev->pending_lock);
- if (qp->timerwait.next == LIST_POISON1)
- list_add_tail(&qp->timerwait,
- &dev->pending[dev->pending_index]);
+ BUG_ON(qp->timerwait.next != LIST_POISON1);
+ list_add_tail(&qp->timerwait, &dev->pending[dev->pending_index]);
spin_unlock(&dev->pending_lock);
}
@@ -135,7 +134,8 @@ static inline u32 ipath_make_rc_ack(stru
*/
qp->r_state = OP(RDMA_READ_RESPONSE_LAST);
qp->s_ack_state = OP(ACKNOWLEDGE);
- return 0;
+ bth0 = 0;
+ goto bail;
case OP(COMPARE_SWAP):
case OP(FETCH_ADD):
@@ -143,7 +143,7 @@ static inline u32 ipath_make_rc_ack(stru
len = 0;
qp->r_state = OP(SEND_LAST);
qp->s_ack_state = OP(ACKNOWLEDGE);
- bth0 = IB_OPCODE_ATOMIC_ACKNOWLEDGE << 24;
+ bth0 = OP(ATOMIC_ACKNOWLEDGE) << 24;
ohdr->u.at.aeth = ipath_compute_aeth(qp);
ohdr->u.at.atomic_ack_eth = cpu_to_be64(qp->s_ack_atomic);
hwords += sizeof(ohdr->u.at) / 4;
@@ -162,6 +162,7 @@ static inline u32 ipath_make_rc_ack(stru
qp->s_cur_sge = ss;
qp->s_cur_size = len;
+bail:
return bth0;
}
@@ -257,7 +258,7 @@ static inline int ipath_make_rc_req(stru
break;
case IB_WR_RDMA_WRITE:
- if (newreq)
+ if (newreq && qp->s_lsn != (u32) -1)
qp->s_lsn++;
/* FALLTHROUGH */
case IB_WR_RDMA_WRITE_WITH_IMM:
@@ -283,8 +284,7 @@ static inline int ipath_make_rc_req(stru
else {
qp->s_state =
OP(RDMA_WRITE_ONLY_WITH_IMMEDIATE);
- /* Immediate data comes
- * after RETH */
+ /* Immediate data comes after RETH */
ohdr->u.rc.imm_data = wqe->wr.imm_data;
hwords += 1;
if (wqe->wr.send_flags & IB_SEND_SOLICITED)
@@ -304,7 +304,8 @@ static inline int ipath_make_rc_req(stru
qp->s_state = OP(RDMA_READ_REQUEST);
hwords += sizeof(ohdr->u.rc.reth) / 4;
if (newreq) {
- qp->s_lsn++;
+ if (qp->s_lsn != (u32) -1)
+ qp->s_lsn++;
/*
* Adjust s_next_psn to count the
* expected number of responses.
@@ -335,7 +336,8 @@ static inline int ipath_make_rc_req(stru
wqe->wr.wr.atomic.compare_add);
hwords += sizeof(struct ib_atomic_eth) / 4;
if (newreq) {
- qp->s_lsn++;
+ if (qp->s_lsn != (u32) -1)
+ qp->s_lsn++;
wqe->lpsn = wqe->psn;
}
if (++qp->s_cur == qp->s_size)
@@ -355,6 +357,11 @@ static inline int ipath_make_rc_req(stru
bth2 |= qp->s_psn++ & IPS_PSN_MASK;
if ((int)(qp->s_psn - qp->s_next_psn) > 0)
qp->s_next_psn = qp->s_psn;
+ /*
+ * Put the QP on the pending list so lost ACKs will cause
+ * a retry. More than one request can be pending so the
+ * QP may already be on the dev->pending list.
+ */
spin_lock(&dev->pending_lock);
if (qp->timerwait.next == LIST_POISON1)
list_add_tail(&qp->timerwait,
@@ -364,8 +371,8 @@ static inline int ipath_make_rc_req(stru
case OP(RDMA_READ_RESPONSE_FIRST):
/*
- * This case can only happen if a send is restarted. See
- * ipath_restart_rc().
+ * This case can only happen if a send is restarted.
+ * See ipath_restart_rc().
*/
ipath_init_restart(qp, wqe);
/* FALLTHROUGH */
@@ -496,29 +503,37 @@ done:
return 0;
}
-static inline void ipath_make_rc_grh(struct ipath_qp *qp,
- struct ib_global_route *grh,
- u32 nwords)
+/**
+ * ipath_make_rc_grh - construct a GRH header
+ * @dev: a pointer to the ipath device
+ * @hdr: a pointer to the GRH header being constructed
+ * @grh: the global route address to send to
+ * @hwords: the number of 32 bit words of header being sent
+ * @nwords: the number of 32 bit words of data being sent
+ *
+ * Return the size of the header in 32 bit words.
+ */
+static u32 ipath_make_rc_grh(struct ipath_ibdev *dev,
+ struct ib_grh *hdr,
+ struct ib_global_route *grh,
+ u32 hwords,
+ u32 nwords)
{
- struct ipath_ibdev *dev = to_idev(qp->ibqp.device);
-
- /* GRH header size in 32-bit words. */
- qp->s_hdrwords += 10;
- qp->s_hdr.u.l.grh.version_tclass_flow =
+ hdr->version_tclass_flow =
cpu_to_be32((6 << 28) |
(grh->traffic_class << 20) |
grh->flow_label);
- qp->s_hdr.u.l.grh.paylen =
- cpu_to_be16(((qp->s_hdrwords - 12) + nwords +
- SIZE_OF_CRC) << 2);
+ hdr->paylen = cpu_to_be16((hwords - 2 + nwords + SIZE_OF_CRC) << 2);
/* next_hdr is defined by C8-7 in ch. 8.4.1 */
- qp->s_hdr.u.l.grh.next_hdr = 0x1B;
- qp->s_hdr.u.l.grh.hop_limit = grh->hop_limit;
+ hdr->next_hdr = 0x1B;
+ hdr->hop_limit = grh->hop_limit;
/* The SGID is 32-bit aligned. */
- qp->s_hdr.u.l.grh.sgid.global.subnet_prefix = dev->gid_prefix;
- qp->s_hdr.u.l.grh.sgid.global.interface_id =
- ipath_layer_get_guid(dev->dd);
- qp->s_hdr.u.l.grh.dgid = grh->dgid;
+ hdr->sgid.global.subnet_prefix = dev->gid_prefix;
+ hdr->sgid.global.interface_id = ipath_layer_get_guid(dev->dd);
+ hdr->dgid = grh->dgid;
+
+ /* GRH header size in 32-bit words. */
+ return sizeof(struct ib_grh) / sizeof(u32);
}
/**
@@ -569,15 +584,6 @@ again:
* If no PIO bufs are available, return. An interrupt will
* call ipath_ib_piobufavail() when one is available.
*/
- _VERBS_INFO("h %u %p\n", qp->s_hdrwords, &qp->s_hdr);
- _VERBS_INFO("d %u %p %u %p %u %u %u %u\n", qp->s_cur_size,
- qp->s_cur_sge->sg_list,
- qp->s_cur_sge->num_sge,
- qp->s_cur_sge->sge.vaddr,
- qp->s_cur_sge->sge.sge_length,
- qp->s_cur_sge->sge.length,
- qp->s_cur_sge->sge.m,
- qp->s_cur_sge->sge.n);
if (ipath_verbs_send(dev->dd, qp->s_hdrwords,
(u32 *) &qp->s_hdr, qp->s_cur_size,
qp->s_cur_sge)) {
@@ -599,8 +605,16 @@ again:
if (qp->s_ack_state != OP(ACKNOWLEDGE) &&
(bth0 = ipath_make_rc_ack(qp, ohdr, pmtu)) != 0)
bth2 = qp->s_ack_psn++ & IPS_PSN_MASK;
- else if (!ipath_make_rc_req(qp, ohdr, pmtu, &bth0, &bth2))
- goto done;
+ else if (!ipath_make_rc_req(qp, ohdr, pmtu, &bth0, &bth2)) {
+ /*
+ * Clear the busy bit before unlocking to avoid races with
+ * adding new work queue items and then failing to process
+ * them.
+ */
+ clear_bit(IPATH_S_BUSY, &qp->s_flags);
+ spin_unlock_irqrestore(&qp->s_lock, flags);
+ goto bail;
+ }
spin_unlock_irqrestore(&qp->s_lock, flags);
@@ -609,7 +623,9 @@ again:
nwords = (qp->s_cur_size + extra_bytes) >> 2;
lrh0 = IPS_LRH_BTH;
if (unlikely(qp->remote_ah_attr.ah_flags & IB_AH_GRH)) {
- ipath_make_rc_grh(qp, &qp->remote_ah_attr.grh, nwords);
+ qp->s_hdrwords += ipath_make_rc_grh(dev, &qp->s_hdr.u.l.grh,
+ &qp->remote_ah_attr.grh,
+ qp->s_hdrwords, nwords);
lrh0 = IPS_LRH_GRH;
}
lrh0 |= qp->remote_ah_attr.sl << 4;
@@ -627,8 +643,6 @@ again:
/* Check for more work to do. */
goto again;
-done:
- spin_unlock_irqrestore(&qp->s_lock, flags);
clear:
clear_bit(IPATH_S_BUSY, &qp->s_flags);
bail:
@@ -640,32 +654,35 @@ static void send_rc_ack(struct ipath_qp
struct ipath_ibdev *dev = to_idev(qp->ibqp.device);
u16 lrh0;
u32 bth0;
+ u32 hwords;
+ struct ipath_ib_header hdr;
struct ipath_other_headers *ohdr;
/* Construct the header. */
- ohdr = &qp->s_hdr.u.oth;
+ ohdr = &hdr.u.oth;
lrh0 = IPS_LRH_BTH;
/* header size in 32-bit words LRH+BTH+AETH = (8+12+4)/4. */
- qp->s_hdrwords = 6;
+ hwords = 6;
if (unlikely(qp->remote_ah_attr.ah_flags & IB_AH_GRH)) {
- ipath_make_rc_grh(qp, &qp->remote_ah_attr.grh, 0);
+ hwords += ipath_make_rc_grh(dev, &hdr.u.l.grh,
+ &qp->remote_ah_attr.grh,
+ hwords, 0);
- ohdr = &qp->s_hdr.u.l.oth;
+ ohdr = &hdr.u.l.oth;
lrh0 = IPS_LRH_GRH;
}
bth0 = ipath_layer_get_pkey(dev->dd, qp->s_pkey_index);
ohdr->u.aeth = ipath_compute_aeth(qp);
if (qp->s_ack_state >= OP(COMPARE_SWAP)) {
- bth0 |= IB_OPCODE_ATOMIC_ACKNOWLEDGE << 24;
+ bth0 |= OP(ATOMIC_ACKNOWLEDGE) << 24;
ohdr->u.at.atomic_ack_eth = cpu_to_be64(qp->s_ack_atomic);
- qp->s_hdrwords += sizeof(ohdr->u.at.atomic_ack_eth) / 4;
- }
- else
+ hwords += sizeof(ohdr->u.at.atomic_ack_eth) / 4;
+ } else
bth0 |= OP(ACKNOWLEDGE) << 24;
lrh0 |= qp->remote_ah_attr.sl << 4;
- qp->s_hdr.lrh[0] = cpu_to_be16(lrh0);
- qp->s_hdr.lrh[1] = cpu_to_be16(qp->remote_ah_attr.dlid);
- qp->s_hdr.lrh[2] = cpu_to_be16(qp->s_hdrwords + SIZE_OF_CRC);
- qp->s_hdr.lrh[3] = cpu_to_be16(ipath_layer_get_lid(dev->dd));
+ hdr.lrh[0] = cpu_to_be16(lrh0);
+ hdr.lrh[1] = cpu_to_be16(qp->remote_ah_attr.dlid);
+ hdr.lrh[2] = cpu_to_be16(hwords + SIZE_OF_CRC);
+ hdr.lrh[3] = cpu_to_be16(ipath_layer_get_lid(dev->dd));
ohdr->bth[0] = cpu_to_be32(bth0);
ohdr->bth[1] = cpu_to_be32(qp->remote_qpn);
ohdr->bth[2] = cpu_to_be32(qp->s_ack_psn & IPS_PSN_MASK);
@@ -673,12 +690,93 @@ static void send_rc_ack(struct ipath_qp
/*
* If we can send the ACK, clear the ACK state.
*/
- if (ipath_verbs_send(dev->dd, qp->s_hdrwords, (u32 *) &qp->s_hdr,
- 0, NULL) == 0) {
+ if (ipath_verbs_send(dev->dd, hwords, (u32 *) &hdr, 0, NULL) == 0) {
qp->s_ack_state = OP(ACKNOWLEDGE);
+ dev->n_unicast_xmit++;
+ } else
dev->n_rc_qacks++;
- dev->n_unicast_xmit++;
- }
+}
+
+/**
+ * reset_psn - reset the QP state to send starting from PSN
+ * @qp: the QP
+ * @psn: the packet sequence number to restart at
+ *
+ * This is called from ipath_rc_rcv() to process an incoming RC ACK
+ * for the given QP.
+ * Called at interrupt level with the QP s_lock held.
+ */
+static void reset_psn(struct ipath_qp *qp, u32 psn)
+{
+ u32 n = qp->s_last;
+ struct ipath_swqe *wqe = get_swqe_ptr(qp, n);
+ u32 opcode;
+
+ qp->s_cur = n;
+
+ /*
+ * If we are starting the request from the beginning,
+ * let the normal send code handle initialization.
+ */
+ if (ipath_cmp24(psn, wqe->psn) <= 0) {
+ qp->s_state = OP(SEND_LAST);
+ goto done;
+ }
+
+ /* Find the work request opcode corresponding to the given PSN. */
+ opcode = wqe->wr.opcode;
+ for (;;) {
+ int diff;
+
+ if (++n == qp->s_size)
+ n = 0;
+ if (n == qp->s_tail)
+ break;
+ wqe = get_swqe_ptr(qp, n);
+ diff = ipath_cmp24(psn, wqe->psn);
+ if (diff < 0)
+ break;
+ qp->s_cur = n;
+ /*
+ * If we are starting the request from the beginning,
+ * let the normal send code handle initialization.
+ */
+ if (diff == 0) {
+ qp->s_state = OP(SEND_LAST);
+ goto done;
+ }
+ opcode = wqe->wr.opcode;
+ }
+
+ /*
+ * Set the state to restart in the middle of a request.
+ * Don't change the s_sge, s_cur_sge, or s_cur_size.
+ * See ipath_do_rc_send().
+ */
+ switch (opcode) {
+ case IB_WR_SEND:
+ case IB_WR_SEND_WITH_IMM:
+ qp->s_state = OP(RDMA_READ_RESPONSE_FIRST);
+ break;
+
+ case IB_WR_RDMA_WRITE:
+ case IB_WR_RDMA_WRITE_WITH_IMM:
+ qp->s_state = OP(RDMA_READ_RESPONSE_LAST);
+ break;
+
+ case IB_WR_RDMA_READ:
+ qp->s_state = OP(RDMA_READ_RESPONSE_MIDDLE);
+ break;
+
+ default:
+ /*
+ * This case shouldn't happen since its only
+ * one PSN per req.
+ */
+ qp->s_state = OP(SEND_LAST);
+ }
+done:
+ qp->s_psn = psn;
}
/**
@@ -693,7 +791,6 @@ void ipath_restart_rc(struct ipath_qp *q
{
struct ipath_swqe *wqe = get_swqe_ptr(qp, qp->s_last);
struct ipath_ibdev *dev;
- u32 n;
/*
* If there are no requests pending, we are done.
@@ -735,130 +832,13 @@ void ipath_restart_rc(struct ipath_qp *q
else
dev->n_rc_resends += (int)qp->s_psn - (int)psn;
- /*
- * If we are starting the request from the beginning, let the normal
- * send code handle initialization.
- */
- qp->s_cur = qp->s_last;
- if (ipath_cmp24(psn, wqe->psn) <= 0) {
- qp->s_state = OP(SEND_LAST);
- qp->s_psn = wqe->psn;
- } else {
- n = qp->s_cur;
- for (;;) {
- if (++n == qp->s_size)
- n = 0;
- if (n == qp->s_tail) {
- if (ipath_cmp24(psn, qp->s_next_psn) >= 0) {
- qp->s_cur = n;
- wqe = get_swqe_ptr(qp, n);
- }
- break;
- }
- wqe = get_swqe_ptr(qp, n);
- if (ipath_cmp24(psn, wqe->psn) < 0)
- break;
- qp->s_cur = n;
- }
- qp->s_psn = psn;
-
- /*
- * Reset the state to restart in the middle of a request.
- * Don't change the s_sge, s_cur_sge, or s_cur_size.
- * See ipath_do_rc_send().
- */
- switch (wqe->wr.opcode) {
- case IB_WR_SEND:
- case IB_WR_SEND_WITH_IMM:
- qp->s_state = OP(RDMA_READ_RESPONSE_FIRST);
- break;
-
- case IB_WR_RDMA_WRITE:
- case IB_WR_RDMA_WRITE_WITH_IMM:
- qp->s_state = OP(RDMA_READ_RESPONSE_LAST);
- break;
-
- case IB_WR_RDMA_READ:
- qp->s_state =
- OP(RDMA_READ_RESPONSE_MIDDLE);
- break;
-
- default:
- /*
- * This case shouldn't happen since its only
- * one PSN per req.
- */
- qp->s_state = OP(SEND_LAST);
- }
- }
+ reset_psn(qp, psn);
done:
tasklet_hi_schedule(&qp->s_task);
bail:
return;
-}
-
-/**
- * reset_psn - reset the QP state to send starting from PSN
- * @qp: the QP
- * @psn: the packet sequence number to restart at
- *
- * This is called from ipath_rc_rcv() to process an incoming RC ACK
- * for the given QP.
- * Called at interrupt level with the QP s_lock held.
- */
-static void reset_psn(struct ipath_qp *qp, u32 psn)
-{
- struct ipath_swqe *wqe;
- u32 n;
-
- n = qp->s_cur;
- wqe = get_swqe_ptr(qp, n);
- for (;;) {
- if (++n == qp->s_size)
- n = 0;
- if (n == qp->s_tail) {
- if (ipath_cmp24(psn, qp->s_next_psn) >= 0) {
- qp->s_cur = n;
- wqe = get_swqe_ptr(qp, n);
- }
- break;
- }
- wqe = get_swqe_ptr(qp, n);
- if (ipath_cmp24(psn, wqe->psn) < 0)
- break;
- qp->s_cur = n;
- }
- qp->s_psn = psn;
-
- /*
- * Set the state to restart in the middle of a
- * request. Don't change the s_sge, s_cur_sge, or
- * s_cur_size. See ipath_do_rc_send().
- */
- switch (wqe->wr.opcode) {
- case IB_WR_SEND:
- case IB_WR_SEND_WITH_IMM:
- qp->s_state = OP(RDMA_READ_RESPONSE_FIRST);
- break;
-
- case IB_WR_RDMA_WRITE:
- case IB_WR_RDMA_WRITE_WITH_IMM:
- qp->s_state = OP(RDMA_READ_RESPONSE_LAST);
- break;
-
- case IB_WR_RDMA_READ:
- qp->s_state = OP(RDMA_READ_RESPONSE_MIDDLE);
- break;
-
- default:
- /*
- * This case shouldn't happen since its only
- * one PSN per req.
- */
- qp->s_state = OP(SEND_LAST);
- }
}
/**
@@ -1011,17 +991,7 @@ static int do_rc_ack(struct ipath_qp *qp
dev->n_rc_resends += (int)qp->s_psn - (int)psn;
- /*
- * If we are starting the request from the beginning, let
- * the normal send code handle initialization.
- */
- qp->s_cur = qp->s_last;
- wqe = get_swqe_ptr(qp, qp->s_cur);
- if (ipath_cmp24(psn, wqe->psn) <= 0) {
- qp->s_state = OP(SEND_LAST);
- qp->s_psn = wqe->psn;
- } else
- reset_psn(qp, psn);
+ reset_psn(qp, psn);
qp->s_rnr_timeout =
ib_ipath_rnr_table[(aeth >> IPS_AETH_CREDIT_SHIFT) &
@@ -1182,33 +1152,34 @@ static inline void ipath_rc_rcv_resp(str
goto ack_done;
}
rdma_read:
- if (unlikely(qp->s_state != OP(RDMA_READ_REQUEST)))
- goto ack_done;
- if (unlikely(tlen != (hdrsize + pmtu + 4)))
- goto ack_done;
- if (unlikely(pmtu >= qp->s_len))
- goto ack_done;
- /* We got a response so update the timeout. */
- if (unlikely(qp->s_last == qp->s_tail ||
- get_swqe_ptr(qp, qp->s_last)->wr.opcode !=
- IB_WR_RDMA_READ))
- goto ack_done;
- spin_lock(&dev->pending_lock);
- if (qp->s_rnr_timeout == 0 &&
- qp->timerwait.next != LIST_POISON1)
- list_move_tail(&qp->timerwait,
- &dev->pending[dev->pending_index]);
- spin_unlock(&dev->pending_lock);
- /*
- * Update the RDMA receive state but do the copy w/o holding the
- * locks and blocking interrupts. XXX Yet another place that
- * affects relaxed RDMA order since we don't want s_sge modified.
- */
- qp->s_len -= pmtu;
- qp->s_last_psn = psn;
- spin_unlock_irqrestore(&qp->s_lock, flags);
- ipath_copy_sge(&qp->s_sge, data, pmtu);
- goto bail;
+ if (unlikely(qp->s_state != OP(RDMA_READ_REQUEST)))
+ goto ack_done;
+ if (unlikely(tlen != (hdrsize + pmtu + 4)))
+ goto ack_done;
+ if (unlikely(pmtu >= qp->s_len))
+ goto ack_done;
+ /* We got a response so update the timeout. */
+ if (unlikely(qp->s_last == qp->s_tail ||
+ get_swqe_ptr(qp, qp->s_last)->wr.opcode !=
+ IB_WR_RDMA_READ))
+ goto ack_done;
+ spin_lock(&dev->pending_lock);
+ if (qp->s_rnr_timeout == 0 &&
+ qp->timerwait.next != LIST_POISON1)
+ list_move_tail(&qp->timerwait,
+ &dev->pending[dev->pending_index]);
+ spin_unlock(&dev->pending_lock);
+ /*
+ * Update the RDMA receive state but do the copy w/o
+ * holding the locks and blocking interrupts.
+ * XXX Yet another place that affects relaxed RDMA order
+ * since we don't want s_sge modified.
+ */
+ qp->s_len -= pmtu;
+ qp->s_last_psn = psn;
+ spin_unlock_irqrestore(&qp->s_lock, flags);
+ ipath_copy_sge(&qp->s_sge, data, pmtu);
+ goto bail;
case OP(RDMA_READ_RESPONSE_LAST):
/* ACKs READ req. */
@@ -1255,9 +1226,12 @@ static inline void ipath_rc_rcv_resp(str
if (do_rc_ack(qp, aeth, psn, OP(RDMA_READ_RESPONSE_LAST))) {
/*
* Change the state so we contimue
- * processing new requests.
+ * processing new requests and wake up the
+ * tasklet if there are posted sends.
*/
qp->s_state = OP(SEND_LAST);
+ if (qp->s_tail != qp->s_head)
+ tasklet_hi_schedule(&qp->s_task);
}
goto ack_done;
}
@@ -1296,6 +1270,8 @@ static inline int ipath_rc_rcv_error(str
{
struct ib_reth *reth;
+ spin_lock(&qp->s_lock);
+
if (diff > 0) {
/*
* Packet sequence error.
@@ -1303,13 +1279,10 @@ static inline int ipath_rc_rcv_error(str
* Don't queue the NAK if a RDMA read, atomic, or
* NAK is pending though.
*/
- spin_lock(&qp->s_lock);
if ((qp->s_ack_state >= OP(RDMA_READ_REQUEST) &&
- qp->s_ack_state != IB_OPCODE_ACKNOWLEDGE) ||
- qp->s_nak_state != 0) {
- spin_unlock(&qp->s_lock);
+ qp->s_ack_state != OP(ACKNOWLEDGE)) ||
+ qp->s_nak_state != 0)
goto done;
- }
qp->s_ack_state = OP(SEND_ONLY);
qp->s_nak_state = IB_NAK_PSN_ERROR;
/* Use the expected PSN. */
@@ -1328,12 +1301,10 @@ static inline int ipath_rc_rcv_error(str
* send the earliest so that RDMA reads can be restarted at
* the requester's expected PSN.
*/
- spin_lock(&qp->s_lock);
- if (qp->s_ack_state != IB_OPCODE_ACKNOWLEDGE &&
+ if (qp->s_ack_state != OP(ACKNOWLEDGE) &&
ipath_cmp24(psn, qp->s_ack_psn) >= 0) {
- if (qp->s_ack_state < IB_OPCODE_RDMA_READ_REQUEST)
+ if (qp->s_ack_state < OP(RDMA_READ_REQUEST))
qp->s_ack_psn = psn;
- spin_unlock(&qp->s_lock);
goto done;
}
switch (opcode) {
@@ -1344,8 +1315,7 @@ static inline int ipath_rc_rcv_error(str
* holding the s_lock.
*/
if (qp->s_ack_state != OP(ACKNOWLEDGE) &&
- qp->s_ack_state >= IB_OPCODE_RDMA_READ_REQUEST) {
- spin_unlock(&qp->s_lock);
+ qp->s_ack_state >= OP(RDMA_READ_REQUEST)) {
dev->n_rdma_dup_busy++;
goto done;
}
@@ -1387,10 +1357,8 @@ static inline int ipath_rc_rcv_error(str
* Check for the PSN of the last atomic operations
* performed and resend the result if found.
*/
- if ((psn & IPS_PSN_MASK) != qp->r_atomic_psn) {
- spin_unlock(&qp->s_lock);
+ if ((psn & IPS_PSN_MASK) != qp->r_atomic_psn)
goto done;
- }
qp->s_ack_atomic = qp->r_atomic_data;
break;
}
@@ -1401,6 +1369,7 @@ resched:
return 0;
done:
+ spin_unlock(&qp->s_lock);
return 1;
}
@@ -1493,22 +1462,23 @@ void ipath_rc_rcv(struct ipath_ibdev *de
opcode == OP(SEND_LAST_WITH_IMMEDIATE))
break;
nack_inv:
- /*
- * A NAK will ACK earlier sends and RDMA writes. Don't queue the
- * NAK if a RDMA read, atomic, or NAK is pending though.
- */
- spin_lock(&qp->s_lock);
- if (qp->s_ack_state >= OP(RDMA_READ_REQUEST) &&
- qp->s_ack_state != IB_OPCODE_ACKNOWLEDGE) {
- spin_unlock(&qp->s_lock);
- goto done;
- }
- /* XXX Flush WQEs */
- qp->state = IB_QPS_ERR;
- qp->s_ack_state = OP(SEND_ONLY);
- qp->s_nak_state = IB_NAK_INVALID_REQUEST;
- qp->s_ack_psn = qp->r_psn;
- goto resched;
+ /*
+ * A NAK will ACK earlier sends and RDMA writes.
+ * Don't queue the NAK if a RDMA read, atomic, or NAK
+ * is pending though.
+ */
+ spin_lock(&qp->s_lock);
+ if (qp->s_ack_state >= OP(RDMA_READ_REQUEST) &&
+ qp->s_ack_state != OP(ACKNOWLEDGE)) {
+ spin_unlock(&qp->s_lock);
+ goto done;
+ }
+ /* XXX Flush WQEs */
+ qp->state = IB_QPS_ERR;
+ qp->s_ack_state = OP(SEND_ONLY);
+ qp->s_nak_state = IB_NAK_INVALID_REQUEST;
+ qp->s_ack_psn = qp->r_psn;
+ goto resched;
case OP(RDMA_WRITE_FIRST):
case OP(RDMA_WRITE_MIDDLE):
@@ -1557,9 +1527,8 @@ void ipath_rc_rcv(struct ipath_ibdev *de
* is pending though.
*/
spin_lock(&qp->s_lock);
- if (qp->s_ack_state >=
- OP(RDMA_READ_REQUEST) &&
- qp->s_ack_state != IB_OPCODE_ACKNOWLEDGE) {
+ if (qp->s_ack_state >= OP(RDMA_READ_REQUEST) &&
+ qp->s_ack_state != OP(ACKNOWLEDGE)) {
spin_unlock(&qp->s_lock);
goto done;
}
@@ -1675,10 +1644,10 @@ void ipath_rc_rcv(struct ipath_ibdev *de
* read, atomic, or NAK is pending though.
*/
spin_lock(&qp->s_lock);
+ nack_acc1:
if (qp->s_ack_state >=
OP(RDMA_READ_REQUEST) &&
- qp->s_ack_state !=
- IB_OPCODE_ACKNOWLEDGE) {
+ qp->s_ack_state != OP(ACKNOWLEDGE)) {
spin_unlock(&qp->s_lock);
goto done;
}
@@ -1716,9 +1685,16 @@ void ipath_rc_rcv(struct ipath_ibdev *de
reth = (struct ib_reth *)data;
data += sizeof(*reth);
}
+ if (unlikely(!(qp->qp_access_flags &
+ IB_ACCESS_REMOTE_READ)))
+ goto nack_acc;
+ /*
+ * Ignore request if we already have an
+ * RDMA read or ATOMIC pending.
+ */
spin_lock(&qp->s_lock);
if (qp->s_ack_state != OP(ACKNOWLEDGE) &&
- qp->s_ack_state >= IB_OPCODE_RDMA_READ_REQUEST) {
+ qp->s_ack_state >= OP(RDMA_READ_REQUEST)) {
spin_unlock(&qp->s_lock);
goto done;
}
@@ -1732,10 +1708,8 @@ void ipath_rc_rcv(struct ipath_ibdev *de
ok = ipath_rkey_ok(dev, &qp->s_rdma_sge,
qp->s_rdma_len, vaddr, rkey,
IB_ACCESS_REMOTE_READ);
- if (unlikely(!ok)) {
- spin_unlock(&qp->s_lock);
- goto nack_acc;
- }
+ if (unlikely(!ok))
+ goto nack_acc1;
/*
* Update the next expected PSN. We add 1 later
* below, so only add the remainder here.
@@ -1750,9 +1724,6 @@ void ipath_rc_rcv(struct ipath_ibdev *de
qp->s_rdma_sge.sge.length = 0;
qp->s_rdma_sge.sge.sge_length = 0;
}
- if (unlikely(!(qp->qp_access_flags &
- IB_ACCESS_REMOTE_READ)))
- goto nack_acc;
/*
* We need to increment the MSN here instead of when we
* finish sending the result since a duplicate request would
@@ -1822,7 +1793,7 @@ void ipath_rc_rcv(struct ipath_ibdev *de
*/
spin_lock(&qp->s_lock);
if (qp->s_ack_state == OP(ACKNOWLEDGE) ||
- qp->s_ack_state < IB_OPCODE_RDMA_READ_REQUEST) {
+ qp->s_ack_state < OP(RDMA_READ_REQUEST)) {
qp->s_ack_state = opcode;
qp->s_nak_state = 0;
qp->s_ack_psn = psn;
@@ -1844,6 +1815,8 @@ resched:
(qp->s_ack_state < IB_OPCODE_RDMA_READ_REQUEST ||
qp->s_ack_state >= IB_OPCODE_COMPARE_SWAP))
send_rc_ack(qp);
+ else
+ dev->n_rc_qacks++;
rdmadone:
spin_unlock(&qp->s_lock);
* [PATCH 9 of 13] ipath - simplify RC send posting
2006-04-24 21:22 [PATCH 0 of 13] ipath - various fixes and cleanups Bryan O'Sullivan
` (7 preceding siblings ...)
2006-04-24 21:23 ` [PATCH 8 of 13] ipath - fix a number of RC protocol bugs Bryan O'Sullivan
@ 2006-04-24 21:23 ` Bryan O'Sullivan
2006-04-24 21:23 ` [PATCH 10 of 13] ipath - simplify IB timer usage Bryan O'Sullivan
` (3 subsequent siblings)
12 siblings, 0 replies; 37+ messages in thread
From: Bryan O'Sullivan @ 2006-04-24 21:23 UTC (permalink / raw)
To: rdreier; +Cc: openib-general, linux-kernel
Remove some unnecessarily complicated tests.
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
diff -r fafcc38877ad -r 4eabd5fc05bb drivers/infiniband/hw/ipath/ipath_ruc.c
--- a/drivers/infiniband/hw/ipath/ipath_ruc.c Mon Apr 24 14:21:04 2006 -0700
+++ b/drivers/infiniband/hw/ipath/ipath_ruc.c Mon Apr 24 14:21:04 2006 -0700
@@ -531,19 +531,12 @@ int ipath_post_rc_send(struct ipath_qp *
}
wqe->wr.num_sge = j;
qp->s_head = next;
- /*
- * Wake up the send tasklet if the QP is not waiting
- * for an RNR timeout.
- */
- next = qp->s_rnr_timeout;
spin_unlock_irqrestore(&qp->s_lock, flags);
- if (next == 0) {
- if (qp->ibqp.qp_type == IB_QPT_UC)
- ipath_do_uc_send((unsigned long) qp);
- else
- ipath_do_rc_send((unsigned long) qp);
- }
+ if (qp->ibqp.qp_type == IB_QPT_UC)
+ ipath_do_uc_send((unsigned long) qp);
+ else
+ ipath_do_rc_send((unsigned long) qp);
ret = 0;
* [PATCH 10 of 13] ipath - simplify IB timer usage
2006-04-24 21:22 [PATCH 0 of 13] ipath - various fixes and cleanups Bryan O'Sullivan
` (8 preceding siblings ...)
2006-04-24 21:23 ` [PATCH 9 of 13] ipath - simplify RC send posting Bryan O'Sullivan
@ 2006-04-24 21:23 ` Bryan O'Sullivan
2006-04-24 21:23 ` [PATCH 11 of 13] ipath - improve sparse annotation Bryan O'Sullivan
` (2 subsequent siblings)
12 siblings, 0 replies; 37+ messages in thread
From: Bryan O'Sullivan @ 2006-04-24 21:23 UTC (permalink / raw)
To: rdreier; +Cc: openib-general, linux-kernel
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
diff -r 4eabd5fc05bb -r 36447eb1f256 drivers/infiniband/hw/ipath/ipath_verbs.c
--- a/drivers/infiniband/hw/ipath/ipath_verbs.c Mon Apr 24 14:21:04 2006 -0700
+++ b/drivers/infiniband/hw/ipath/ipath_verbs.c Mon Apr 24 14:21:04 2006 -0700
@@ -449,7 +449,6 @@ static void ipath_ib_timer(void *arg)
{
struct ipath_ibdev *dev = (struct ipath_ibdev *) arg;
struct ipath_qp *resend = NULL;
- struct ipath_qp *rnr = NULL;
struct list_head *last;
struct ipath_qp *qp;
unsigned long flags;
@@ -465,32 +464,18 @@ static void ipath_ib_timer(void *arg)
last = &dev->pending[dev->pending_index];
while (!list_empty(last)) {
qp = list_entry(last->next, struct ipath_qp, timerwait);
- if (last->next == LIST_POISON1 ||
- last->next != &qp->timerwait ||
- qp->timerwait.prev != last) {
- INIT_LIST_HEAD(last);
- } else {
- list_del(&qp->timerwait);
- qp->timerwait.prev = (struct list_head *) resend;
- resend = qp;
- atomic_inc(&qp->refcount);
- }
+ list_del(&qp->timerwait);
+ qp->timer_next = resend;
+ resend = qp;
+ atomic_inc(&qp->refcount);
}
last = &dev->rnrwait;
if (!list_empty(last)) {
qp = list_entry(last->next, struct ipath_qp, timerwait);
if (--qp->s_rnr_timeout == 0) {
do {
- if (last->next == LIST_POISON1 ||
- last->next != &qp->timerwait ||
- qp->timerwait.prev != last) {
- INIT_LIST_HEAD(last);
- break;
- }
list_del(&qp->timerwait);
- qp->timerwait.prev =
- (struct list_head *) rnr;
- rnr = qp;
+ tasklet_hi_schedule(&qp->s_task);
if (list_empty(last))
break;
qp = list_entry(last->next, struct ipath_qp,
@@ -530,8 +515,7 @@ static void ipath_ib_timer(void *arg)
spin_unlock_irqrestore(&dev->pending_lock, flags);
/* XXX What if timer fires again while this is running? */
- for (qp = resend; qp != NULL;
- qp = (struct ipath_qp *) qp->timerwait.prev) {
+ for (qp = resend; qp != NULL; qp = qp->timer_next) {
struct ib_wc wc;
spin_lock_irqsave(&qp->s_lock, flags);
@@ -545,9 +529,6 @@ static void ipath_ib_timer(void *arg)
if (atomic_dec_and_test(&qp->refcount))
wake_up(&qp->wait);
}
- for (qp = rnr; qp != NULL;
- qp = (struct ipath_qp *) qp->timerwait.prev)
- tasklet_hi_schedule(&qp->s_task);
}
/**
@@ -556,9 +537,9 @@ static void ipath_ib_timer(void *arg)
*
* This is called from ipath_intr() at interrupt level when a PIO buffer is
* available after ipath_verbs_send() returned an error that no buffers were
- * available. Return 0 if we consumed all the PIO buffers and we still have
+ * available. Return 1 if we consumed all the PIO buffers and we still have
* QPs waiting for buffers (for now, just do a tasklet_hi_schedule and
- * return one).
+ * return zero).
*/
static int ipath_ib_piobufavail(void *arg)
{
@@ -579,7 +560,7 @@ static int ipath_ib_piobufavail(void *ar
spin_unlock_irqrestore(&dev->pending_lock, flags);
bail:
- return 1;
+ return 0;
}
static int ipath_query_device(struct ib_device *ibdev,
@@ -1159,7 +1140,7 @@ static ssize_t show_stats(struct class_d
len = sprintf(buf,
"RC resends %d\n"
- "RC QACKs %d\n"
+ "RC no QACK %d\n"
"RC ACKs %d\n"
"RC SEQ NAKs %d\n"
"RC RDMA seq %d\n"
diff -r 4eabd5fc05bb -r 36447eb1f256 drivers/infiniband/hw/ipath/ipath_verbs.h
--- a/drivers/infiniband/hw/ipath/ipath_verbs.h Mon Apr 24 14:21:04 2006 -0700
+++ b/drivers/infiniband/hw/ipath/ipath_verbs.h Mon Apr 24 14:21:04 2006 -0700
@@ -282,7 +282,8 @@ struct ipath_srq {
*/
struct ipath_qp {
struct ib_qp ibqp;
- struct ipath_qp *next; /* link list for QPN hash table */
+ struct ipath_qp *next; /* link list for QPN hash table */
+ struct ipath_qp *timer_next; /* link list for ipath_ib_timer() */
struct list_head piowait; /* link for wait PIO buf */
struct list_head timerwait; /* link for waiting for timeouts */
struct ib_ah_attr remote_ah_attr;
* [PATCH 11 of 13] ipath - improve sparse annotation
2006-04-24 21:22 [PATCH 0 of 13] ipath - various fixes and cleanups Bryan O'Sullivan
` (9 preceding siblings ...)
2006-04-24 21:23 ` [PATCH 10 of 13] ipath - simplify IB timer usage Bryan O'Sullivan
@ 2006-04-24 21:23 ` Bryan O'Sullivan
2006-04-24 21:23 ` [PATCH 12 of 13] ipath - fix label name in interrupt handler Bryan O'Sullivan
2006-04-24 21:23 ` [PATCH 13 of 13] ipath - tidy up white space in a few files Bryan O'Sullivan
12 siblings, 0 replies; 37+ messages in thread
From: Bryan O'Sullivan @ 2006-04-24 21:23 UTC (permalink / raw)
To: rdreier; +Cc: openib-general, linux-kernel
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
diff -r 36447eb1f256 -r f23abcaaea84 drivers/infiniband/hw/ipath/ips_common.h
--- a/drivers/infiniband/hw/ipath/ips_common.h Mon Apr 24 14:21:04 2006 -0700
+++ b/drivers/infiniband/hw/ipath/ips_common.h Mon Apr 24 14:21:04 2006 -0700
@@ -95,7 +95,7 @@ struct ether_header {
__u8 seq_num;
__le32 len;
/* MUST be of word size due to PIO write requirements */
- __u32 csum;
+ __le32 csum;
__le16 csum_offset;
__le16 flags;
__u16 first_2_bytes;
* [PATCH 12 of 13] ipath - fix label name in interrupt handler
2006-04-24 21:22 [PATCH 0 of 13] ipath - various fixes and cleanups Bryan O'Sullivan
` (10 preceding siblings ...)
2006-04-24 21:23 ` [PATCH 11 of 13] ipath - improve sparse annotation Bryan O'Sullivan
@ 2006-04-24 21:23 ` Bryan O'Sullivan
2006-04-24 21:23 ` [PATCH 13 of 13] ipath - tidy up white space in a few files Bryan O'Sullivan
12 siblings, 0 replies; 37+ messages in thread
From: Bryan O'Sullivan @ 2006-04-24 21:23 UTC (permalink / raw)
To: rdreier; +Cc: openib-general, linux-kernel
Names that are the opposite of their intended meanings are not so helpful.
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
diff -r f23abcaaea84 -r e3f1bfd7ce46 drivers/infiniband/hw/ipath/ipath_intr.c
--- a/drivers/infiniband/hw/ipath/ipath_intr.c Mon Apr 24 14:21:04 2006 -0700
+++ b/drivers/infiniband/hw/ipath/ipath_intr.c Mon Apr 24 14:21:04 2006 -0700
@@ -665,14 +665,14 @@ static void handle_layer_pioavail(struct
ret = __ipath_layer_intr(dd, IPATH_LAYER_INT_SEND_CONTINUE);
if (ret > 0)
- goto clear;
+ goto set;
ret = __ipath_verbs_piobufavail(dd);
if (ret > 0)
- goto clear;
+ goto set;
return;
-clear:
+set:
set_bit(IPATH_S_PIOINTBUFAVAIL, &dd->ipath_sendctrl);
ipath_write_kreg(dd, dd->ipath_kregs->kr_sendctrl,
dd->ipath_sendctrl);
* [PATCH 13 of 13] ipath - tidy up white space in a few files
2006-04-24 21:22 [PATCH 0 of 13] ipath - various fixes and cleanups Bryan O'Sullivan
` (11 preceding siblings ...)
2006-04-24 21:23 ` [PATCH 12 of 13] ipath - fix label name in interrupt handler Bryan O'Sullivan
@ 2006-04-24 21:23 ` Bryan O'Sullivan
2006-05-01 19:17 ` Roland Dreier
12 siblings, 1 reply; 37+ messages in thread
From: Bryan O'Sullivan @ 2006-04-24 21:23 UTC (permalink / raw)
To: rdreier; +Cc: openib-general, linux-kernel
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
diff -r e3f1bfd7ce46 -r 895650567032 drivers/infiniband/hw/ipath/ipath_debug.h
--- a/drivers/infiniband/hw/ipath/ipath_debug.h Mon Apr 24 14:21:04 2006 -0700
+++ b/drivers/infiniband/hw/ipath/ipath_debug.h Mon Apr 24 14:21:04 2006 -0700
@@ -60,11 +60,11 @@
#define __IPATH_KERNEL_SEND 0x2000 /* use kernel mode send */
#define __IPATH_EPKTDBG 0x4000 /* print ethernet packet data */
#define __IPATH_SMADBG 0x8000 /* sma packet debug */
-#define __IPATH_IPATHDBG 0x10000 /* Ethernet (IPATH) general debug on */
-#define __IPATH_IPATHWARN 0x20000 /* Ethernet (IPATH) warnings on */
-#define __IPATH_IPATHERR 0x40000 /* Ethernet (IPATH) errors on */
-#define __IPATH_IPATHPD 0x80000 /* Ethernet (IPATH) packet dump on */
-#define __IPATH_IPATHTABLE 0x100000 /* Ethernet (IPATH) table dump on */
+#define __IPATH_IPATHDBG 0x10000 /* Ethernet (IPATH) gen debug */
+#define __IPATH_IPATHWARN 0x20000 /* Ethernet (IPATH) warnings */
+#define __IPATH_IPATHERR 0x40000 /* Ethernet (IPATH) errors */
+#define __IPATH_IPATHPD 0x80000 /* Ethernet (IPATH) packet dump */
+#define __IPATH_IPATHTABLE 0x100000 /* Ethernet (IPATH) table dump */
#else /* _IPATH_DEBUGGING */
@@ -79,11 +79,12 @@
#define __IPATH_TRSAMPLE 0x0 /* generate trace buffer sample entries */
#define __IPATH_VERBDBG 0x0 /* very verbose debug */
#define __IPATH_PKTDBG 0x0 /* print packet data */
-#define __IPATH_PROCDBG 0x0 /* print process startup (init)/exit messages */
+#define __IPATH_PROCDBG 0x0 /* process startup (init)/exit messages */
/* print mmap/nopage stuff, not using VDBG any more */
#define __IPATH_MMDBG 0x0
#define __IPATH_EPKTDBG 0x0 /* print ethernet packet data */
-#define __IPATH_SMADBG 0x0 /* print process startup (init)/exit messages */#define __IPATH_IPATHDBG 0x0 /* Ethernet (IPATH) table dump on */
+#define __IPATH_SMADBG 0x0 /* process startup (init)/exit messages */
+#define __IPATH_IPATHDBG 0x0 /* Ethernet (IPATH) table dump on */
#define __IPATH_IPATHWARN 0x0 /* Ethernet (IPATH) warnings on */
#define __IPATH_IPATHERR 0x0 /* Ethernet (IPATH) errors on */
#define __IPATH_IPATHPD 0x0 /* Ethernet (IPATH) packet dump on */
diff -r e3f1bfd7ce46 -r 895650567032 drivers/infiniband/hw/ipath/ipath_registers.h
--- a/drivers/infiniband/hw/ipath/ipath_registers.h Mon Apr 24 14:21:04 2006 -0700
+++ b/drivers/infiniband/hw/ipath/ipath_registers.h Mon Apr 24 14:21:04 2006 -0700
@@ -34,8 +34,9 @@
#define _IPATH_REGISTERS_H
/*
- * This file should only be included by kernel source, and by the diags.
- * It defines the registers, and their contents, for the InfiniPath HT-400 chip
+ * This file should only be included by kernel source, and by the diags. It
+ * defines the registers, and their contents, for the InfiniPath HT-400
+ * chip.
*/
/*
@@ -156,8 +157,10 @@
#define INFINIPATH_IBCC_FLOWCTRLWATERMARK_SHIFT 8
#define INFINIPATH_IBCC_LINKINITCMD_MASK 0x3ULL
#define INFINIPATH_IBCC_LINKINITCMD_DISABLE 1
-#define INFINIPATH_IBCC_LINKINITCMD_POLL 2 /* cycle through TS1/TS2 till OK */
-#define INFINIPATH_IBCC_LINKINITCMD_SLEEP 3 /* wait for TS1, then go on */
+/* cycle through TS1/TS2 till OK */
+#define INFINIPATH_IBCC_LINKINITCMD_POLL 2
+/* wait for TS1, then go on */
+#define INFINIPATH_IBCC_LINKINITCMD_SLEEP 3
#define INFINIPATH_IBCC_LINKINITCMD_SHIFT 16
#define INFINIPATH_IBCC_LINKCMD_MASK 0x3ULL
#define INFINIPATH_IBCC_LINKCMD_INIT 1 /* move to 0x11 */
@@ -182,7 +185,8 @@
#define INFINIPATH_IBCS_LINKSTATE_SHIFT 4
#define INFINIPATH_IBCS_TXREADY 0x40000000
#define INFINIPATH_IBCS_TXCREDITOK 0x80000000
-/* link training states (shift by INFINIPATH_IBCS_LINKTRAININGSTATE_SHIFT) */
+/* link training states (shift by
+ INFINIPATH_IBCS_LINKTRAININGSTATE_SHIFT) */
#define INFINIPATH_IBCS_LT_STATE_DISABLED 0x00
#define INFINIPATH_IBCS_LT_STATE_LINKUP 0x01
#define INFINIPATH_IBCS_LT_STATE_POLLACTIVE 0x02
@@ -267,10 +271,12 @@
/* kr_serdesconfig0 bits */
#define INFINIPATH_SERDC0_RESET_MASK 0xfULL /* overal reset bits */
#define INFINIPATH_SERDC0_RESET_PLL 0x10000000ULL /* pll reset */
-#define INFINIPATH_SERDC0_TXIDLE 0xF000ULL /* tx idle enables (per lane) */
-#define INFINIPATH_SERDC0_RXDETECT_EN 0xF0000ULL /* rx detect enables (per lane) */
-#define INFINIPATH_SERDC0_L1PWR_DN 0xF0ULL /* L1 Power down; use with RXDETECT,
- Otherwise not used on IB side */
+/* tx idle enables (per lane) */
+#define INFINIPATH_SERDC0_TXIDLE 0xF000ULL
+/* rx detect enables (per lane) */
+#define INFINIPATH_SERDC0_RXDETECT_EN 0xF0000ULL
+/* L1 Power down; use with RXDETECT, Otherwise not used on IB side */
+#define INFINIPATH_SERDC0_L1PWR_DN 0xF0ULL
/* kr_xgxsconfig bits */
#define INFINIPATH_XGXS_RESET 0x7ULL
@@ -390,12 +396,13 @@ struct ipath_kregs {
ipath_kreg kr_txintmemsize;
ipath_kreg kr_xgxsconfig;
ipath_kreg kr_ibpllcfg;
- /* use these two (and the following N ports) only with ipath_k*_kreg64_port();
- * not *kreg64() */
+ /* use these two (and the following N ports) only with
+ * ipath_k*_kreg64_port(); not *kreg64() */
ipath_kreg kr_rcvhdraddr;
ipath_kreg kr_rcvhdrtailaddr;
- /* remaining registers are not present on all types of infinipath chips */
+ /* remaining registers are not present on all types of infinipath
+ chips */
ipath_kreg kr_rcvpktledcnt;
ipath_kreg kr_pcierbuftestreg0;
ipath_kreg kr_pcierbuftestreg1;
diff -r e3f1bfd7ce46 -r 895650567032 drivers/infiniband/hw/ipath/ipath_ud.c
--- a/drivers/infiniband/hw/ipath/ipath_ud.c Mon Apr 24 14:21:04 2006 -0700
+++ b/drivers/infiniband/hw/ipath/ipath_ud.c Mon Apr 24 14:21:04 2006 -0700
@@ -46,8 +46,10 @@
* This is called from ipath_post_ud_send() to forward a WQE addressed
* to the same HCA.
*/
-static void ipath_ud_loopback(struct ipath_qp *sqp, struct ipath_sge_state *ss,
- u32 length, struct ib_send_wr *wr, struct ib_wc *wc)
+static void ipath_ud_loopback(struct ipath_qp *sqp,
+ struct ipath_sge_state *ss,
+ u32 length, struct ib_send_wr *wr,
+ struct ib_wc *wc)
{
struct ipath_ibdev *dev = to_idev(sqp->ibqp.device);
struct ipath_qp *qp;
* Re: [PATCH 8 of 13] ipath - fix a number of RC protocol bugs
2006-04-24 21:23 ` [PATCH 8 of 13] ipath - fix a number of RC protocol bugs Bryan O'Sullivan
@ 2006-04-25 7:56 ` Andrew Morton
2006-05-01 17:22 ` Roland Dreier
0 siblings, 1 reply; 37+ messages in thread
From: Andrew Morton @ 2006-04-25 7:56 UTC (permalink / raw)
To: Bryan O'Sullivan; +Cc: rdreier, openib-general, linux-kernel
"Bryan O'Sullivan" <bos@pathscale.com> wrote:
>
> + BUG_ON(qp->timerwait.next != LIST_POISON1);
> + list_add_tail(&qp->timerwait, &dev->pending[dev->pending_index]);
Please don't play around with list_head internals like this - some
speedfreak might legitimately choose to remove the list_head poisoning
debug code, or make it Kconfigurable.
One option would be to always do list_del_init() on this thing, then do
BUG_ON(!list_empty()).
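
A minimal userspace sketch of the approach suggested above, assuming a cut-down stand-in for the kernel's struct list_head. The demo_qp type and the demo_* helpers are hypothetical and are not taken from the ipath driver; assert() stands in for BUG_ON(). The idea is that an entry removed with list_del_init() points back at itself, so "not queued" can be tested with list_empty() instead of comparing against LIST_POISON1.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* An entry (or head) that points at itself is "empty". */
static int list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	struct list_head *prev = head->prev;

	entry->next = head;
	entry->prev = prev;
	prev->next = entry;
	head->prev = entry;
}

/* Unlink the entry and reinitialise it, so list_empty(entry) is true. */
static void list_del_init(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	INIT_LIST_HEAD(entry);
}

/* Hypothetical, cut-down QP that only carries the timer link. */
struct demo_qp {
	struct list_head timerwait;
};

static void demo_qp_init(struct demo_qp *qp)
{
	INIT_LIST_HEAD(&qp->timerwait);
}

static int demo_qp_is_queued(const struct demo_qp *qp)
{
	return !list_empty(&qp->timerwait);
}

static void demo_qp_queue(struct demo_qp *qp, struct list_head *pending)
{
	/* Stands in for BUG_ON(!list_empty(&qp->timerwait)). */
	assert(list_empty(&qp->timerwait));
	list_add_tail(&qp->timerwait, pending);
}

static void demo_qp_dequeue(struct demo_qp *qp)
{
	list_del_init(&qp->timerwait);
}
```

This only works if every removal of the entry goes through list_del_init(); a plain list_del() would leave the entry in a state that list_empty() cannot interpret.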
* Re: [PATCH 4 of 13] ipath - change handling of PIO buffers
2006-04-24 21:23 ` [PATCH 4 of 13] ipath - change handling of PIO buffers Bryan O'Sullivan
@ 2006-04-25 9:32 ` Segher Boessenkool
0 siblings, 0 replies; 37+ messages in thread
From: Segher Boessenkool @ 2006-04-25 9:32 UTC (permalink / raw)
To: Bryan O'Sullivan; +Cc: rdreier, openib-general, linux-kernel
> + * The problem with this is that it's global, but we'll use different
> + * numbers for different chip types. So the default value is not
> + * very useful. I've redefined it for the 1.3 release so that it's
----------------------------------------------^^^
Change this to 2.6.17?
> + * zero unless set by the user to something else, in which case we
> + * try to respect it.
Segher
* Re: [PATCH 8 of 13] ipath - fix a number of RC protocol bugs
2006-04-25 7:56 ` Andrew Morton
@ 2006-05-01 17:22 ` Roland Dreier
2006-05-01 17:34 ` Bryan O'Sullivan
0 siblings, 1 reply; 37+ messages in thread
From: Roland Dreier @ 2006-05-01 17:22 UTC (permalink / raw)
To: Andrew Morton; +Cc: Bryan O'Sullivan, openib-general, linux-kernel
Andrew> Please don't play around with list_head internals like
Andrew> this - some speedfreak might legitimately choose to remove
Andrew> the list_head poisoning debug code, or make it
Andrew> Kconfigurable.
Bryan, can you fix this up and resend this patch?
Are the other patches independent of this? Should I apply all the
others, or do I need to wait for the fixed version of this one?
- R.
* Re: [PATCH 8 of 13] ipath - fix a number of RC protocol bugs
2006-05-01 17:22 ` Roland Dreier
@ 2006-05-01 17:34 ` Bryan O'Sullivan
0 siblings, 0 replies; 37+ messages in thread
From: Bryan O'Sullivan @ 2006-05-01 17:34 UTC (permalink / raw)
To: Roland Dreier; +Cc: Andrew Morton, openib-general, linux-kernel
On Mon, 2006-05-01 at 10:22 -0700, Roland Dreier wrote:
> Andrew> Please don't play around with list_head internals like
> Andrew> this - some speedfreak might legitimately choose to remove
> Andrew> the list_head poisoning debug code, or make it
> Andrew> Kconfigurable.
>
> Bryan, can you fix this up and resend this patch?
Yep. We already have a fix; I just need to put it in my queue.
> Are the other patches independent of this? Should I apply all the
> others, or do I need to wait for the fixed version of this one?
They're all independent of this, so please fire away.
Thanks,
<b
* Re: [PATCH 2 of 13] ipath - set up 32-bit DMA mask if 64-bit setup fails
2006-04-24 21:22 ` [PATCH 2 of 13] ipath - set up 32-bit DMA mask if 64-bit setup fails Bryan O'Sullivan
@ 2006-05-01 18:47 ` Roland Dreier
2006-05-01 19:56 ` Segher Boessenkool
0 siblings, 1 reply; 37+ messages in thread
From: Roland Dreier @ 2006-05-01 18:47 UTC (permalink / raw)
To: Bryan O'Sullivan; +Cc: openib-general, linux-kernel
Bryan> Some systems do not set up 64-bit maps on systems with 2GB
Bryan> or less of memory installed, so we have to fall back to
Bryan> trying a 32-bit setup.
Which systems does this happen on? I'm just curious, because mthca has
err = pci_set_dma_mask(pdev, DMA_64BIT_MASK);
if (err) {
dev_warn(&pdev->dev, "Warning: couldn't set 64-bit PCI DMA mask.\n");
and I've never had a single report of that warning triggering.
- R.
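
For readers following along, the fallback described in the patch summary (try a 64-bit mask, then a 32-bit one) can be sketched in userspace as below. This is an illustration under stated assumptions, not the ipath or mthca code: demo_dev, demo_set_dma_mask() and the DEMO_* constants are hypothetical stand-ins for the PCI device, pci_set_dma_mask() and the kernel's mask constants.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_DMA_64BIT_MASK 0xffffffffffffffffULL
#define DEMO_DMA_32BIT_MASK 0x00000000ffffffffULL

/* Hypothetical device model: the platform accepts masks up to a limit. */
struct demo_dev {
	uint64_t max_supported_mask;	/* widest mask the platform allows */
	uint64_t dma_mask;		/* mask currently in effect, 0 if none */
};

/* Stand-in for pci_set_dma_mask(): 0 on success, -1 on failure. */
static int demo_set_dma_mask(struct demo_dev *dev, uint64_t mask)
{
	if (mask > dev->max_supported_mask)
		return -1;
	dev->dma_mask = mask;
	return 0;
}

/* Try the 64-bit mask first; fall back to 32-bit only if that fails. */
static int demo_setup_dma(struct demo_dev *dev)
{
	int ret;

	assert(dev != NULL);
	ret = demo_set_dma_mask(dev, DEMO_DMA_64BIT_MASK);
	if (ret)
		ret = demo_set_dma_mask(dev, DEMO_DMA_32BIT_MASK);
	return ret;
}
```

Whether the 64-bit attempt ever fails on real hardware is exactly the question raised above; the sketch only shows the control flow.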
* Re: [PATCH 5 of 13] ipath - use proper address translation routine
2006-04-24 21:23 ` [PATCH 5 of 13] ipath - use proper address translation routine Bryan O'Sullivan
@ 2006-05-01 18:50 ` Roland Dreier
2006-05-01 18:54 ` Arjan van de Ven
2006-05-01 19:03 ` Bryan O'Sullivan
0 siblings, 2 replies; 37+ messages in thread
From: Roland Dreier @ 2006-05-01 18:50 UTC (permalink / raw)
To: Bryan O'Sullivan; +Cc: openib-general, linux-kernel
Bryan> Move away from an obsolete, unportable routine for
Bryan> translating physical addresses.
This change:
> - isge->vaddr = bus_to_virt(sge->addr);
> + isge->vaddr = phys_to_virt(sge->addr);
is really wrong. bus_to_virt() is really what you want, because in
this case the address is a bus address that came from dma_map_xxx().
You're still going to be hosed on systems with IOMMUs for example.
- R.
* Re: [PATCH 5 of 13] ipath - use proper address translation routine
2006-05-01 18:50 ` Roland Dreier
@ 2006-05-01 18:54 ` Arjan van de Ven
2006-05-01 19:00 ` Roland Dreier
2006-05-01 19:03 ` Bryan O'Sullivan
1 sibling, 1 reply; 37+ messages in thread
From: Arjan van de Ven @ 2006-05-01 18:54 UTC (permalink / raw)
To: Roland Dreier; +Cc: Bryan O'Sullivan, openib-general, linux-kernel
On Mon, 2006-05-01 at 11:50 -0700, Roland Dreier wrote:
> Bryan> Move away from an obsolete, unportable routine for
> Bryan> translating physical addresses.
>
> This change:
>
> > - isge->vaddr = bus_to_virt(sge->addr);
> > + isge->vaddr = phys_to_virt(sge->addr);
>
> is really wrong. bus_to_virt() is really what you want, because in
> this case the address is a bus address that came from dma_map_xxx().
> You're still going to be hosed on systems with IOMMUs for example.
do you really NEED the vaddr?
(most of the time linux drivers don't need it, while other OSes do)
If you really need it you should grab it at dma_map time ...
(and realize that it's not kernel addressable per se ;)
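
One way to read the "grab it at dma_map time" suggestion is sketched below in userspace C. It is an assumption-laden illustration, not the ipath implementation: demo_sge, demo_map() and demo_copy_out() are hypothetical, and the fake bus-address allocator merely imitates an IOMMU handing out addresses unrelated to CPU pointers. The point is that the CPU address is recorded when the mapping is made, so nothing later has to translate a bus address back with bus_to_virt() or phys_to_virt().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scatter/gather entry that remembers both views of a buffer. */
struct demo_sge {
	uint64_t addr;		/* bus address handed to the upper layer */
	void *vaddr;		/* CPU address recorded at map time */
	uint32_t length;
};

/* Fake IOMMU allocator: bus addresses are unrelated to CPU pointers. */
static uint64_t demo_next_bus_addr = 0x1000;

/* "Map" a buffer: hand out a bus address and keep the CPU address. */
static void demo_map(struct demo_sge *sge, void *buf, uint32_t len)
{
	assert(sge != NULL && buf != NULL);
	sge->vaddr = buf;
	sge->addr = demo_next_bus_addr;
	sge->length = len;
	/* Advance by whole 4 KiB pages, as a page-granular IOMMU might. */
	demo_next_bus_addr += (len + 0xfffu) & ~0xfffu;
}

/* CPU copy using the saved pointer; no reverse translation is needed. */
static void demo_copy_out(void *dst, const struct demo_sge *sge)
{
	memcpy(dst, sge->vaddr, sge->length);
}
```

As the caveat above notes, a real buffer is not necessarily kernel addressable, so a driver taking this route would still have to decide what kind of address it can safely record.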
* Re: [PATCH 5 of 13] ipath - use proper address translation routine
2006-05-01 18:54 ` Arjan van de Ven
@ 2006-05-01 19:00 ` Roland Dreier
2006-05-01 19:20 ` Arjan van de Ven
2006-05-02 13:35 ` Christoph Hellwig
0 siblings, 2 replies; 37+ messages in thread
From: Roland Dreier @ 2006-05-01 19:00 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: Bryan O'Sullivan, openib-general, linux-kernel
Arjan> do you really NEED the vaddr? (most of the time linux
Arjan> drivers don't need it, while other OSes do) If you really
Arjan> need it you should grab it at dma_map time ... (and
Arjan> realize that it's not kernel addressable per se ;)
Yes, they need some kind of vaddr.
It's kind of a layering problem. The IB stack assumes that IB devices
have a DMA engine that deals with bus addresses. But the ipath driver
has to simulate this by using a memcpy on the CPU to move data to the
PCI device.
I really don't know what the right solution is. Maybe having some way
to override the dma mapping operations so that the ipath driver can
keep the info it needs?
- R.
* Re: [PATCH 5 of 13] ipath - use proper address translation routine
2006-05-01 18:50 ` Roland Dreier
2006-05-01 18:54 ` Arjan van de Ven
@ 2006-05-01 19:03 ` Bryan O'Sullivan
1 sibling, 0 replies; 37+ messages in thread
From: Bryan O'Sullivan @ 2006-05-01 19:03 UTC (permalink / raw)
To: Roland Dreier; +Cc: openib-general, linux-kernel
On Mon, 2006-05-01 at 11:50 -0700, Roland Dreier wrote:
> Bryan> Move away from an obsolete, unportable routine for
> Bryan> translating physical addresses.
>
> This change:
>
> > - isge->vaddr = bus_to_virt(sge->addr);
> > + isge->vaddr = phys_to_virt(sge->addr);
>
> is really wrong. bus_to_virt() is really what you want, because in
> this case the address is a bus address that came from dma_map_xxx().
Well, bus_to_virt is not portable, so we definitely can't use it. I'll
have to do some thinking about this.
<b
* Re: [PATCH 13 of 13] ipath - tidy up white space in a few files
2006-04-24 21:23 ` [PATCH 13 of 13] ipath - tidy up white space in a few files Bryan O'Sullivan
@ 2006-05-01 19:17 ` Roland Dreier
0 siblings, 0 replies; 37+ messages in thread
From: Roland Dreier @ 2006-05-01 19:17 UTC (permalink / raw)
To: Bryan O'Sullivan; +Cc: openib-general, linux-kernel
Applied all except 5/13 and 8/13...
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 5 of 13] ipath - use proper address translation routine
2006-05-01 19:00 ` Roland Dreier
@ 2006-05-01 19:20 ` Arjan van de Ven
2006-05-01 19:28 ` Roland Dreier
2006-05-02 13:35 ` Christoph Hellwig
1 sibling, 1 reply; 37+ messages in thread
From: Arjan van de Ven @ 2006-05-01 19:20 UTC (permalink / raw)
To: Roland Dreier; +Cc: Bryan O'Sullivan, openib-general, linux-kernel
On Mon, 2006-05-01 at 12:00 -0700, Roland Dreier wrote:
> Arjan> do you really NEED the vaddr? (most of the time linux
> Arjan> drivers don't need it, while other OSes do) If you really
> Arjan> need it you should grab it at dma_map time ... (and
> Arjan> realize that it's not kernel addressable per se ;)
>
> Yes, they need some kind of vaddr.
>
> It's kind of a layering problem. The IB stack assumes that IB devices
> have a DMA engine that deals with bus addresses. But the ipath driver
> has to simulate this by using a memcpy on the CPU to move data to the
> PCI device.
>
> I really don't know what the right solution is. Maybe having some way
> to override the dma mapping operations so that the ipath driver can
> keep the info it needs?
sounds like you need to redesign your layering ;)
In linux it's common to have the lowest level driver do the mapping
(even when the mid layer will provide the most commonly used helper to
do it for the common case)...
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 5 of 13] ipath - use proper address translation routine
2006-05-01 19:20 ` Arjan van de Ven
@ 2006-05-01 19:28 ` Roland Dreier
0 siblings, 0 replies; 37+ messages in thread
From: Roland Dreier @ 2006-05-01 19:28 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: Bryan O'Sullivan, openib-general, linux-kernel
Arjan> sounds like you need to redesign your layering ;) In linux
Arjan> it's common to have the lowest level driver do the mapping
Arjan> (even when the mid layer will provide the most commonly
Arjan> used helper to do it for the common case)...
It's not that simple of course...
InfiniBand allows RDMA -- _remote_ DMA. So that address might be
something that a protocol sent to the remote host and which is now
showing up for a DMA operation initiated by the remote side. And we
can't very well send a struct page * + offset to the remote side...
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 2 of 13] ipath - set up 32-bit DMA mask if 64-bit setup fails
2006-05-01 18:47 ` Roland Dreier
@ 2006-05-01 19:56 ` Segher Boessenkool
2006-05-01 21:41 ` Roland Dreier
0 siblings, 1 reply; 37+ messages in thread
From: Segher Boessenkool @ 2006-05-01 19:56 UTC (permalink / raw)
To: Roland Dreier; +Cc: Bryan O'Sullivan, openib-general, linux-kernel
> Bryan> Some systems do not set up 64-bit maps on systems with 2GB
> Bryan> or less of memory installed, so we have to fall back to
> Bryan> trying a 32-bit setup.
>
> Which systems does this happen on?
PowerPC with U3 or U4 northbridge, i.e. Maple or PowerMac G5 systems.
If the IOMMU (DART) is disabled, we have a 32-bit only DMA mask. The
DART will be disabled by default if there is 2GB or less of memory (as
it isn't needed then).
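
[Editorial sketch, not part of the original message.] The fallback described above (try a 64-bit DMA mask, then retry with a 32-bit mask if the platform refuses) can be modelled in plain user-space C. This is only an illustration under stated assumptions: `struct fake_dev`, `fake_set_dma_mask()` and `setup_dma_mask()` are hypothetical stand-ins invented here, not the kernel's `pci_set_dma_mask()` and not the code from the ipath patch.

```c
#include <stdint.h>
#include <stdio.h>

#define MASK_64BIT 0xffffffffffffffffULL
#define MASK_32BIT 0x00000000ffffffffULL

/* Hypothetical stand-in for a PCI device; not a kernel structure. */
struct fake_dev {
	uint64_t platform_max_mask;	/* widest mask the platform accepts */
	uint64_t dma_mask;		/* mask currently in effect */
};

/*
 * Hypothetical stand-in for a "set DMA mask" call.
 * Returns 0 on success, -1 if the platform cannot honour the mask.
 */
static int fake_set_dma_mask(struct fake_dev *dev, uint64_t mask)
{
	if (mask & ~dev->platform_max_mask)
		return -1;
	dev->dma_mask = mask;
	return 0;
}

/* Try 64-bit addressing first, then fall back to 32-bit. */
static int setup_dma_mask(struct fake_dev *dev)
{
	int ret = fake_set_dma_mask(dev, MASK_64BIT);

	if (ret) {
		/* 64-bit setup refused, e.g. because the IOMMU is disabled. */
		ret = fake_set_dma_mask(dev, MASK_32BIT);
		if (ret)
			fprintf(stderr, "unable to set any usable DMA mask\n");
	}
	return ret;
}
```

On a simulated platform whose `platform_max_mask` is `MASK_32BIT`, `setup_dma_mask()` succeeds and leaves the 32-bit mask in place, rather than failing outright at the first attempt.
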
> I'm just curious, because mthca has
>
> err = pci_set_dma_mask(pdev, DMA_64BIT_MASK);
> if (err) {
> dev_warn(&pdev->dev, "Warning: couldn't set 64-bit PCI DMA mask.\n");
>
> and I've never had a single report of that warning triggering.
That's only because I never used those cards on systems with fewer
than 4GB of memory :-)
Segher
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 2 of 13] ipath - set up 32-bit DMA mask if 64-bit setup fails
2006-05-01 19:56 ` Segher Boessenkool
@ 2006-05-01 21:41 ` Roland Dreier
2006-05-01 23:13 ` Segher Boessenkool
0 siblings, 1 reply; 37+ messages in thread
From: Roland Dreier @ 2006-05-01 21:41 UTC (permalink / raw)
To: Segher Boessenkool; +Cc: Bryan O'Sullivan, openib-general, linux-kernel
Segher> PowerPC with U3 or U4 northbridge, i.e. Maple or PowerMac
Segher> G5 systems. If the IOMMU (DART) is disabled, we have a
Segher> 32-bit only DMA mask. The DART will be disabled by
Segher> default if there is 2GB or less of memory (as it isn't
Segher> needed then).
OK, thanks. I was not aware of that situation.
However, I suspect that PathScale has a different situation in mind,
considering that their driver isn't even buildable for that platform ;)
- R.
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 2 of 13] ipath - set up 32-bit DMA mask if 64-bit setup fails
2006-05-01 21:41 ` Roland Dreier
@ 2006-05-01 23:13 ` Segher Boessenkool
2006-05-01 23:27 ` [openib-general] " Roland Dreier
0 siblings, 1 reply; 37+ messages in thread
From: Segher Boessenkool @ 2006-05-01 23:13 UTC (permalink / raw)
To: Roland Dreier; +Cc: Bryan O'Sullivan, openib-general, linux-kernel
> Segher> PowerPC with U3 or U4 northbridge, i.e. Maple or PowerMac
> Segher> G5 systems. If the IOMMU (DART) is disabled, we have a
> Segher> 32-bit only DMA mask. The DART will be disabled by
> Segher> default if there is 2GB or less of memory (as it isn't
> Segher> needed then).
>
> OK, thanks. I was not aware of that situation.
>
> However, I suspect that PathScale has a different situation in mind,
> considering that their driver isn't even buildable for that
> platform ;)
Well (a previous version of) that patch came from me, draw your own
conclusions :-)
And it builds just fine -- what is the problem you're thinking of?
Segher
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [openib-general] Re: [PATCH 2 of 13] ipath - set up 32-bit DMA mask if 64-bit setup fails
2006-05-01 23:13 ` Segher Boessenkool
@ 2006-05-01 23:27 ` Roland Dreier
2006-05-02 0:13 ` Segher Boessenkool
0 siblings, 1 reply; 37+ messages in thread
From: Roland Dreier @ 2006-05-01 23:27 UTC (permalink / raw)
To: Segher Boessenkool; +Cc: linux-kernel, openib-general
Segher> And it builds just fine -- what is the problem you're
Segher> thinking of?
Well, the ipath driver depends on PCI_MSI, and PCI_MSI depends on
(X86_LOCAL_APIC && X86_IO_APIC) || IA64
So how do you enable the driver?
And what powerpc platform can you use the device on?
- R.
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [openib-general] Re: [PATCH 2 of 13] ipath - set up 32-bit DMA mask if 64-bit setup fails
2006-05-01 23:27 ` [openib-general] " Roland Dreier
@ 2006-05-02 0:13 ` Segher Boessenkool
2006-05-02 0:18 ` Roland Dreier
0 siblings, 1 reply; 37+ messages in thread
From: Segher Boessenkool @ 2006-05-02 0:13 UTC (permalink / raw)
To: Roland Dreier; +Cc: linux-kernel, openib-general
> Segher> And it builds just fine -- what is the problem you're
> Segher> thinking of?
>
> Well, the ipath driver depends on PCI_MSI, and PCI_MSI depends on
> (X86_LOCAL_APIC && X86_IO_APIC) || IA64
Oh, that. Right. It's about time I get my whole MSI patch set into
shape for submission here, yes.
> So how do you enable the driver?
In a very hackish way right now :-(
> And what powerpc platform can you use the device on?
The latest PowerMacs have hardware support for MSI, to name just
one platform.
Segher
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [openib-general] Re: [PATCH 2 of 13] ipath - set up 32-bit DMA mask if 64-bit setup fails
2006-05-02 0:13 ` Segher Boessenkool
@ 2006-05-02 0:18 ` Roland Dreier
0 siblings, 0 replies; 37+ messages in thread
From: Roland Dreier @ 2006-05-02 0:18 UTC (permalink / raw)
To: Segher Boessenkool; +Cc: linux-kernel, openib-general
Segher> Oh, that. Right. It's about time I get my whole MSI
Segher> patch set into shape for submission here, yes.
OK, that explains everything ;)
So the ipath driver with a PCIe device works on a PowerMac G5? Cool.
- R.
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 5 of 13] ipath - use proper address translation routine
2006-05-01 19:00 ` Roland Dreier
2006-05-01 19:20 ` Arjan van de Ven
@ 2006-05-02 13:35 ` Christoph Hellwig
2006-05-02 14:24 ` Roland Dreier
1 sibling, 1 reply; 37+ messages in thread
From: Christoph Hellwig @ 2006-05-02 13:35 UTC (permalink / raw)
To: Roland Dreier
Cc: Arjan van de Ven, Bryan O'Sullivan, openib-general,
linux-kernel
On Mon, May 01, 2006 at 12:00:00PM -0700, Roland Dreier wrote:
> Arjan> do you really NEED the vaddr? (most of the time linux
> Arjan> drivers don't need it, while other OSes do) If you really
> Arjan> need it you should grab it at dma_map time ... (and
> Arjan> realize that it's not kernel addressable per se ;)
>
> Yes, they need some kind of vaddr.
>
> It's kind of a layering problem. The IB stack assumes that IB devices
> have a DMA engine that deals with bus addresses. But the ipath driver
> has to simulate this by using a memcpy on the CPU to move data to the
> PCI device.
>
> I really don't know what the right solution is. Maybe having some way
> to override the dma mapping operations so that the ipath driver can
> keep the info it needs?
Or stop doing the dma mapping in the IB upper level drivers. I told you
that we'll get broken hardware that doesn't want dma mapping in the upper
level driver, and pathscale created exactly that :)
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 5 of 13] ipath - use proper address translation routine
2006-05-02 13:35 ` Christoph Hellwig
@ 2006-05-02 14:24 ` Roland Dreier
2006-05-02 14:27 ` Christoph Hellwig
2006-05-02 14:55 ` Alan Cox
0 siblings, 2 replies; 37+ messages in thread
From: Roland Dreier @ 2006-05-02 14:24 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Arjan van de Ven, Bryan O'Sullivan, openib-general,
linux-kernel
Christoph> Or stop doing the dma mapping in the IB upper level
Christoph> drivers. I told you that we'll get broken hardware
Christoph> that doesn't want dma mapping in the upper level
Christoph> driver, and pathscale created exactly that :)
But see my earlier mail to Arjan about RDMA -- what address can a
protocol (eg SRP initiator) put in a message that the other side will
use to initiate a remote DMA operation? It seems to me it has to be a
bus address, and that means that the protocol has to do the DMA mapping.
- R.
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 5 of 13] ipath - use proper address translation routine
2006-05-02 14:24 ` Roland Dreier
@ 2006-05-02 14:27 ` Christoph Hellwig
2006-05-02 14:55 ` Alan Cox
1 sibling, 0 replies; 37+ messages in thread
From: Christoph Hellwig @ 2006-05-02 14:27 UTC (permalink / raw)
To: Roland Dreier
Cc: Christoph Hellwig, Arjan van de Ven, Bryan O'Sullivan,
openib-general, linux-kernel
On Tue, May 02, 2006 at 07:24:18AM -0700, Roland Dreier wrote:
> Christoph> Or stop doing the dma mapping in the IB upper level
> Christoph> drivers. I told you that we'll get broken hardware
> Christoph> that doesn't want dma mapping in the upper level
> Christoph> driver, and pathscale created exactly that :)
>
> But see my earlier mail to Arjan about RDMA -- what address can a
> protocol (eg SRP initiator) put in a message that the other side will
> use to initiate a remote DMA operation? It seems to me it has to be a
> bus address, and that means that the protocol has to do the DMA mapping.
Then we're back to the discussion on why RDMA is a fundamentally flawed
approach, but we already knew that. The usual workaround is to only
allow RDMA operations to registered memory windows for which we can use
the normal dma operation. There's also the *dac* pci dma operations that
can avoid iommu overhead if you support 64bit addressing. But for all this
to work dma mapping fundamentally needs to be handled by the low level driver.
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 5 of 13] ipath - use proper address translation routine
2006-05-02 14:24 ` Roland Dreier
2006-05-02 14:27 ` Christoph Hellwig
@ 2006-05-02 14:55 ` Alan Cox
2006-05-02 14:58 ` Roland Dreier
1 sibling, 1 reply; 37+ messages in thread
From: Alan Cox @ 2006-05-02 14:55 UTC (permalink / raw)
To: Roland Dreier
Cc: Christoph Hellwig, Arjan van de Ven, Bryan O'Sullivan,
openib-general, linux-kernel
On Tue, 2006-05-02 at 07:24 -0700, Roland Dreier wrote:
> But see my earlier mail to Arjan about RDMA -- what address can a
> protocol (eg SRP initiator) put in a message that the other side will
> use to initiate a remote DMA operation? It seems to me it has to be a
> bus address, and that means that the protocol has to do the DMA mapping.
For most drivers probably, but you are making assumptions again. Why
can't a driver which is doing its own mapping also do its own rdma
cookie handling? You opt out of mapping being done for you, then you
get opted out of defaults for other stuff too.
Alan
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 5 of 13] ipath - use proper address translation routine
2006-05-02 14:55 ` Alan Cox
@ 2006-05-02 14:58 ` Roland Dreier
0 siblings, 0 replies; 37+ messages in thread
From: Roland Dreier @ 2006-05-02 14:58 UTC (permalink / raw)
To: Alan Cox
Cc: Christoph Hellwig, Arjan van de Ven, Bryan O'Sullivan,
openib-general, linux-kernel
Alan> For most drivers probably, but you are making assumptions
Alan> again. Why can't a driver which is doing its own mapping
Alan> also do its own rdma cookie handling? You opt out of
Alan> mapping being done for you, then you get opted out of
Alan> defaults for other stuff too.
You're right, and that was what I was driving at in my earlier message
when I talked about overriding the dma mapping operations for a
device. That would let ipath or whatever create its own RDMA cookies,
and keep track of the struct page or kernel virtual address of the
original memory, so it can do memcpy when needed.
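
[Editorial sketch, not part of the original message.] The idea in the paragraph above, a driver that hands out its own cookies and remembers the kernel virtual address behind each one so that it can memcpy later, might look roughly like the user-space model below. Every name here (`sketch_map`, `sketch_unmap`, `sketch_copy_in`, the fixed-size table) is hypothetical and invented for illustration; none of it is an existing kernel or ipath interface.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_MAPPINGS 16

/* One remembered buffer: where it lives and how long it is. */
struct mapping {
	void *vaddr;
	size_t len;
	int in_use;
};

static struct mapping table[MAX_MAPPINGS];

/* "Map" a buffer: remember its address and return an opaque cookie (0 = failure). */
static uint64_t sketch_map(void *vaddr, size_t len)
{
	for (size_t i = 0; i < MAX_MAPPINGS; i++) {
		if (!table[i].in_use) {
			table[i].vaddr = vaddr;
			table[i].len = len;
			table[i].in_use = 1;
			return (uint64_t)i + 1;	/* cookie is the slot index plus one */
		}
	}
	return 0;
}

/* Forget a mapping; unknown cookies are ignored. */
static void sketch_unmap(uint64_t cookie)
{
	if (cookie >= 1 && cookie <= MAX_MAPPINGS)
		table[cookie - 1].in_use = 0;
}

/*
 * Copy n bytes into the buffer named by cookie, starting at offset.
 * The CPU does the copy, so no bus-to-virtual translation is needed.
 * Returns 0 on success, -1 on a bad cookie or an out-of-range access.
 */
static int sketch_copy_in(uint64_t cookie, size_t offset,
			  const void *src, size_t n)
{
	struct mapping *m;

	if (cookie < 1 || cookie > MAX_MAPPINGS)
		return -1;
	m = &table[cookie - 1];
	if (!m->in_use || offset > m->len || n > m->len - offset)
		return -1;
	memcpy((char *)m->vaddr + offset, src, n);
	return 0;
}
```

The point of the design is that the cookie is the only thing that leaves the driver; the virtual address is looked up locally when data arrives, instead of being reconstructed from a bus address.
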
I don't think the idea lets you push mapping down into the low-level
driver, though. Take the SRP initiator as a specific example. The
SCSI midlayer gives SRP a SCSI command to send. The SRP initiator
formats that into an SRP message, with a "memory descriptor" (address
and RDMA cookie) for the buffer associated with the SCSI command, and
tells the low-level driver to send that message to the target. The
target then performs RDMA into that buffer, sending back only the RDMA
cookie and address.
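
[Editorial sketch, not part of the original message.] To make the "memory descriptor" in the paragraph above concrete, here is a small model of a protocol packing an address, an RDMA cookie and a length into a message in big-endian byte order. The struct name, field widths and field order are assumptions chosen for illustration and are not quoted from the SRP specification.

```c
#include <stdint.h>

/* Hypothetical memory descriptor carried inside a protocol message. */
struct mem_desc {
	uint64_t addr;		/* address the remote side will target */
	uint32_t cookie;	/* RDMA cookie naming the registered memory */
	uint32_t len;		/* buffer length in bytes */
};

/* Store a 32-bit value most significant byte first. */
static void put_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}

/* Load a 32-bit value stored most significant byte first. */
static uint32_t get_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Serialise the descriptor into 16 bytes of wire format. */
static void pack_mem_desc(uint8_t out[16], const struct mem_desc *d)
{
	put_be32(out, (uint32_t)(d->addr >> 32));
	put_be32(out + 4, (uint32_t)d->addr);
	put_be32(out + 8, d->cookie);
	put_be32(out + 12, d->len);
}

/* Parse 16 bytes of wire format back into a descriptor. */
static void unpack_mem_desc(struct mem_desc *d, const uint8_t in[16])
{
	d->addr = ((uint64_t)get_be32(in) << 32) | get_be32(in + 4);
	d->cookie = get_be32(in + 8);
	d->len = get_be32(in + 12);
}
```

Because the descriptor is built by the protocol layer before the message reaches the low-level driver, whatever goes into `addr` must already be meaningful to the remote side, which is the constraint being argued about in this thread.
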
So unless you teach every low-level driver how to snoop inside SRP
messages (along with NFS/RDMA, iSER and all the other protocols), I
don't see where the low-level driver has a chance to do the mapping.
- R.
^ permalink raw reply [flat|nested] 37+ messages in thread