* [PATCH] mlx4: propagate node_description changes down to FW to generate trap 144
@ 2010-10-04 12:11 Jack Morgenstein
From: Jack Morgenstein @ 2010-10-04 12:11 UTC
To: rdreier-FYB4Gu1CFyUAvxtiuMwx3w
Cc: Or Gerlitz, Hal Rosenstock, tziporet-VPRAkNaXOzVS1MOuV/RT9w,
yaeli-VPRAkNaXOzVS1MOuV/RT9w, linux-rdma-u79uwXL29TY76Z2rM5mHXA
The Node Description cannot be changed via MADs (it is read-only).
Until now, it was changed in the driver via sysfs, and the new Node
Description was simply inserted by the driver into MAD responses
(replacing the description returned by FW).
openibd used this sysfs interface to change the node description
at driver startup. However, that created a race condition: OpenSM
could get the FW node description rather than the sysfs description if it
queried the device before openibd had a chance to enter the new description.
The solution is a new FW command (SET_NODE) which allows passing the
new node description to FW. When this command is invoked, FW issues
a 144 trap to OpenSM. Upon receiving this trap, OpenSM can query the
node to obtain the new node description -- thus eliminating the effects
of the race.
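
The sequence above can be illustrated with a small userspace toy model. This is only a sketch: every name in it (toy_fw, toy_sm, toy_fw_set_node, and so on) is hypothetical, it collapses the driver and FW into a single "node", and it does not use any real mlx4 or OpenSM interface. It shows the SM caching a stale description from an early query and then picking up the new one when the trap prompts a re-query.

```c
#include <stdio.h>
#include <string.h>

#define NODE_DESC_LEN 64

/* Hypothetical stand-in for the node: holds the description it reports. */
struct toy_fw {
	char node_desc[NODE_DESC_LEN];
};

/* Hypothetical stand-in for the SM: caches what it last read. */
struct toy_sm {
	char cached_desc[NODE_DESC_LEN];
	int traps_seen;
};

/* The SM reads the node description and caches it. */
static void toy_sm_query(struct toy_sm *sm, const struct toy_fw *fw)
{
	memcpy(sm->cached_desc, fw->node_desc, NODE_DESC_LEN);
}

/* On trap 144 the SM re-queries the node, refreshing its cache. */
static void toy_sm_on_trap144(struct toy_sm *sm, const struct toy_fw *fw)
{
	sm->traps_seen++;
	toy_sm_query(sm, fw);
}

/* Toy SET_NODE: store the new description, then notify the SM. */
static void toy_fw_set_node(struct toy_fw *fw, struct toy_sm *sm,
			    const char *desc)
{
	memset(fw->node_desc, 0, NODE_DESC_LEN);
	snprintf(fw->node_desc, NODE_DESC_LEN, "%s", desc);
	toy_sm_on_trap144(sm, fw);
}
```

In this model, a toy_sm_query() made before toy_fw_set_node() leaves the old description in the cache, and the trap handler is what replaces it. Without the trap, the stale value would remain until the SM happened to query again.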
This patch simply invokes the SET_NODE command when a new node
description is entered via sysfs (thus causing trap 144 to be issued
by the FW).
The patch works whether or not the new FW command is available. If SET_NODE
is not available, things operate as before (i.e., the node description
is changed in the driver, but no trap is issued).
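
The best-effort behavior can be sketched in plain userspace C. This is a minimal model under stated assumptions, not the driver code: toy_dev, toy_cmd_set_node and toy_modify_node_desc are hypothetical names, and the fw_has_set_node flag stands in for whether the FW implements the command. The 64-byte description and the zeroed 256-byte mailbox mirror the sizes used in the patch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NODE_DESC_LEN 64
#define MAILBOX_LEN   256

/* Hypothetical device model: driver copy, FW copy, and a trap counter. */
struct toy_dev {
	char node_desc[NODE_DESC_LEN];    /* driver copy, always updated */
	char fw_node_desc[NODE_DESC_LEN]; /* FW copy, updated only by SET_NODE */
	bool fw_has_set_node;             /* assumption: models FW support */
	int traps_sent;
};

/* Toy SET_NODE: fails on old FW, else stores the description and traps. */
static int toy_cmd_set_node(struct toy_dev *dev, const uint8_t *mailbox)
{
	if (!dev->fw_has_set_node)
		return -1;
	memcpy(dev->fw_node_desc, mailbox, NODE_DESC_LEN);
	dev->traps_sent++;
	return 0;
}

/* desc must point to a NODE_DESC_LEN-byte buffer. */
static int toy_modify_node_desc(struct toy_dev *dev, const char *desc)
{
	uint8_t mailbox[MAILBOX_LEN];

	/* Step 1: update the driver copy unconditionally. */
	memcpy(dev->node_desc, desc, NODE_DESC_LEN);

	/* Step 2: zero the mailbox, place the description at offset 0. */
	memset(mailbox, 0, sizeof(mailbox));
	memcpy(mailbox, desc, NODE_DESC_LEN);

	/* Step 3: try the command; a failure is deliberately ignored. */
	(void)toy_cmd_set_node(dev, mailbox);

	return 0;
}
```

With fw_has_set_node false, the call still returns 0 and only the driver copy changes. With it true, the FW copy is updated as well and one trap is counted.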
Signed-off-by: Jack Morgenstein <jackm-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
---
Roland,
I left the snooping and overwriting of the node description as is.
That way, I do not need to test and remember whether SET_NODE succeeded,
and I do not need to know which FW version introduced SET_NODE.
Since we need to keep the old method (snooping and overwriting) around anyway,
I think it is easier and simpler to keep things as they are, and only use
SET_NODE to generate trap 144.
Please note that I do not generate any errors if SET_NODE fails for any reason
(including failure to allocate a mailbox), since the node description has been
successfully entered into the driver and is available.
Jack
Index: infiniband/drivers/infiniband/hw/mlx4/main.c
===================================================================
--- infiniband.orig/drivers/infiniband/hw/mlx4/main.c 2010-10-04 10:54:50.000000000 +0200
+++ infiniband/drivers/infiniband/hw/mlx4/main.c 2010-10-04 12:25:46.000000000 +0200
@@ -272,14 +272,31 @@ out:
 static int mlx4_ib_modify_device(struct ib_device *ibdev, int mask,
 				 struct ib_device_modify *props)
 {
+	struct mlx4_cmd_mailbox *mailbox;
+
 	if (mask & ~IB_DEVICE_MODIFY_NODE_DESC)
 		return -EOPNOTSUPP;
-	if (mask & IB_DEVICE_MODIFY_NODE_DESC) {
-		spin_lock(&to_mdev(ibdev)->sm_lock);
-		memcpy(ibdev->node_desc, props->node_desc, 64);
-		spin_unlock(&to_mdev(ibdev)->sm_lock);
-	}
+	if (!(mask & IB_DEVICE_MODIFY_NODE_DESC))
+		return 0;
+
+	spin_lock(&to_mdev(ibdev)->sm_lock);
+	memcpy(ibdev->node_desc, props->node_desc, 64);
+	spin_unlock(&to_mdev(ibdev)->sm_lock);
+
+	/* if possible, pass node desc to FW, so it can generate
+	 * a 144 trap. If cmd fails, just ignore.
+	 */
+	mailbox = mlx4_alloc_cmd_mailbox(to_mdev(ibdev)->dev);
+	if (IS_ERR(mailbox))
+		return 0;
+
+	memset(mailbox->buf, 0, 256);
+	memcpy(mailbox->buf, props->node_desc, 64);
+	mlx4_cmd(to_mdev(ibdev)->dev, mailbox->dma, 1, 0,
+		 MLX4_CMD_SET_NODE, MLX4_CMD_TIME_CLASS_A);
+
+	mlx4_free_cmd_mailbox(to_mdev(ibdev)->dev, mailbox);
 	return 0;
 }
Index: infiniband/include/linux/mlx4/cmd.h
===================================================================
--- infiniband.orig/include/linux/mlx4/cmd.h 2010-01-28 09:43:01.000000000 +0200
+++ infiniband/include/linux/mlx4/cmd.h 2010-10-04 11:10:27.000000000 +0200
@@ -57,6 +57,7 @@ enum {
 	MLX4_CMD_QUERY_PORT = 0x43,
 	MLX4_CMD_SENSE_PORT = 0x4d,
 	MLX4_CMD_SET_PORT = 0xc,
+	MLX4_CMD_SET_NODE = 0x5a,
 	MLX4_CMD_ACCESS_DDR = 0x2e,
 	MLX4_CMD_MAP_ICM = 0xffa,
 	MLX4_CMD_UNMAP_ICM = 0xff9,
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: [PATCH] mlx4: propagate node_description changes down to FW to generate trap 144
@ 2010-10-23 20:53 ` Roland Dreier
From: Roland Dreier @ 2010-10-23 20:53 UTC
To: Jack Morgenstein
Cc: Or Gerlitz, Hal Rosenstock, tziporet-VPRAkNaXOzVS1MOuV/RT9w,
yaeli-VPRAkNaXOzVS1MOuV/RT9w, linux-rdma-u79uwXL29TY76Z2rM5mHXA
Thanks, applied.