* [PATCH] management: adding mad_dump_fields to libibmad
From: Mike Heinz @ 2010-05-06 18:27 UTC (permalink / raw)
To: Sasha Khapyorsky,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
ewg-G2znmakfqn7U1rindQTSdQ@public.gmane.org
[-- Attachment #1: Type: text/plain, Size: 1825 bytes --]
Sasha asked that I re-submit the patches for perfquery in a slightly different format. This is the first of 3 patches.
This patch adds a function to libibmad that allows the caller to dump a configurable range of MAD attributes. Basically, this provides an external interface to the internal function _dump_fields.
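As a rough illustration of the idea (not the libibmad code itself), the sketch below shows a public wrapper around an internal range-dumping helper. The names field_desc, dump_fields_internal and dump_fields_public, and the "name:value" output format, are hypothetical stand-ins; libibmad uses its own field table and formatting. The wrapper is declared void, so it calls the helper and drops the character count instead of returning it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical field table entry; libibmad keeps its own internal table. */
struct field_desc {
	const char *name;
	unsigned int value;
};

/* Internal helper: formats fields start..end (inclusive) into buf and
 * returns the number of characters written, not counting the NUL. */
static int dump_fields_internal(char *buf, int bufsz,
				const struct field_desc *fields,
				int start, int end)
{
	char *s = buf;
	int i;

	assert(buf && fields && bufsz > 0 && start <= end);
	buf[0] = '\0';
	for (i = start; i <= end; i++) {
		int left = bufsz - (int)(s - buf);
		int n = snprintf(s, (size_t)left, "%s:%u\n",
				 fields[i].name, fields[i].value);
		if (n < 0 || n >= left) {
			*s = '\0';	/* drop the partially written field */
			break;
		}
		s += n;
	}
	return (int)(s - buf);
}

/* Public wrapper: a void function, so the count is discarded rather
 * than returned. */
void dump_fields_public(char *buf, int bufsz,
			const struct field_desc *fields, int start, int end)
{
	(void)dump_fields_internal(buf, bufsz, fields, start, end);
}
```

A caller would pass the first and last index of the range it wants, much as perfquery would pass the first and last field of a counter group.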
Signed Off: Michael Heinz
---------------- snip --------------
diff --git a/libibmad/include/infiniband/mad.h b/libibmad/include/infiniband/mad.h
index 02ef551..0478c2b 100644
--- a/libibmad/include/infiniband/mad.h
+++ b/libibmad/include/infiniband/mad.h
@@ -1031,6 +1031,9 @@ MAD_EXPORT ib_mad_dump_fn
mad_dump_perfcounters_xmt_disc, mad_dump_perfcounters_rcv_err,
mad_dump_portsamples_control;
+MAD_EXPORT void mad_dump_fields(char *buf, int bufsz, void *val, int valsz,
+ int start, int end);
+
MAD_EXPORT int ibdebug;
#if __BYTE_ORDER == __LITTLE_ENDIAN
diff --git a/libibmad/src/dump.c b/libibmad/src/dump.c
index 335e190..cc9c10f 100644
--- a/libibmad/src/dump.c
+++ b/libibmad/src/dump.c
@@ -671,6 +671,11 @@ static int _dump_fields(char *buf, int bufsz, void *data, int start, int end)
return (int)(s - buf);
}
+void mad_dump_fields(char *buf, int bufsz, void *val, int valsz, int start, int end)
+{
+ _dump_fields(buf, bufsz, val, start, end);
+}
+
void mad_dump_nodedesc(char *buf, int bufsz, void *val, int valsz)
{
strncpy(buf, val, bufsz);
diff --git a/libibmad/src/libibmad.map b/libibmad/src/libibmad.map
index e2d0b05..5778e3e 100644
--- a/libibmad/src/libibmad.map
+++ b/libibmad/src/libibmad.map
@@ -20,6 +20,7 @@ IBMAD_1.3 {
mad_dump_nodedesc;
mad_dump_nodeinfo;
mad_dump_opervls;
+ mad_dump_fields;
mad_dump_perfcounters;
mad_dump_perfcounters_ext;
mad_dump_perfcounters_xmt_sl;
[-- Attachment #2: dump_fields.patch --]
[-- Type: application/octet-stream, Size: 1395 bytes --]
^ permalink raw reply related
* Re: [PATCH 2/3] ib/iser: remove buggy back-pointer setting
From: Mike Christie @ 2010-05-06 17:09 UTC (permalink / raw)
To: Or Gerlitz; +Cc: Roland Dreier, linux-rdma
In-Reply-To: <4BE27CB1.5000609-hKgKHo2Ms0FWk0Htik3J/w@public.gmane.org>
On 05/06/2010 03:24 AM, Or Gerlitz wrote:
> Mike Christie wrote:
>> I agree on it being a bug, but do you remember why that was added to iscsi_iser_conn_destroy originally?
>> (I later moved it to iser_conn_release in commit b40977d95fb3a1898ace6a7d97e4ed1a33a440a4)
>> but I think Erez had added that null and some checks for it being null for a specific bug.
>> I am not 100% sure. Look in the git logs to make sure. I will check them too when I get some more time.
>
> Mike, I took a look on the git, in commit 87e8df7a273c7c1acb864c90b5253609c44375a6 "Have iSER data transaction object point to iSER conn", Erez added these two chunks,
>
>> @@ -317,6 +317,8 @@ iscsi_iser_conn_destroy(struct iscsi_cls_conn *cls_conn)
>> + if (iser_conn->ib_conn)
>> + iser_conn->ib_conn->iser_conn = NULL;
>
>
>> @@ -571,6 +571,8 @@ void iser_conn_release(struct iser_conn *ib_conn)
>> + if (ib_conn->iser_conn)
>> + ib_conn->iser_conn->ib_conn = NULL;
>
> The problem he was trying to solve was related to the processing of RX/TX buffers flushed by the QP throughout the disconnection flow, so he was shutting down the UP/DOWN pointers.
>
> Later in commit b40977d95 you touched the UP NULL-ing, leaving it in different form. I don't see any specific reason to have the buggy DOWN NULL-ing in iser_conn_release, I believe it doesn't solve any problem and adds a bug, this is what my patch comes to solve, okay?
>
Yeah, sounds good. Thanks for looking that up.
I thought it was for some case where iscsid was confused, but it looks
like I was wrong (I also double checked userspace to see if we had a bug
that caused a need for it and did not see one), so add my:
Reviewed-by: Mike Christie <michaelc-hcNo3dDEHLuVc3sceRu5cw@public.gmane.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* RE: [PATCH 2/2] RDMA/nes: add support of iWARP multicast acceleration over IB_QPT_RAW_ETY QP type
From: Walukiewicz, Miroslaw @ 2010-05-06 16:27 UTC (permalink / raw)
To: Steve Wise
Cc: rdreier-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <4BE18710.9090304-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
Steve Wise wrote:
> Is this all just optimizing mcast packets?
The RAW ETH QP API could be used to accelerate sending and receiving any L2 packets. It depends on the application and HW setup. We use it for accelerating multicast traffic.
Steve Wise wrote:
> Does this raw qp service share the mac address with the ports being used
> by the host stack? Or does each raw qp get its own mac address?
We use the MAC address of the port as the source MAC. The destination MAC is derived from the multicast group. In theory, it is possible to use another MAC for unicast traffic acceleration, but that is much more complex, because it requires answering ARP requests correctly and HW support for steering unicast packets to the correct QPs.
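For reference, a minimal sketch of how a destination MAC can be derived from an IPv4 multicast group, assuming the standard IPv4-to-Ethernet mapping (01:00:5e followed by the low 23 bits of the group address). Whether the nes driver uses exactly this code path is an assumption; mcast_group_to_mac is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Map an IPv4 multicast group (host byte order, e.g. 0xEF010203 for
 * 239.1.2.3) to its Ethernet multicast MAC address. Only the low 23
 * bits of the group survive, so 32 groups share each MAC. */
void mcast_group_to_mac(uint32_t group, uint8_t mac[6])
{
	/* IPv4 multicast addresses are 224.0.0.0/4. */
	assert((group & 0xF0000000u) == 0xE0000000u);

	mac[0] = 0x01;
	mac[1] = 0x00;
	mac[2] = 0x5e;
	mac[3] = (uint8_t)((group >> 16) & 0x7f);	/* top bit is dropped */
	mac[4] = (uint8_t)((group >> 8) & 0xff);
	mac[5] = (uint8_t)(group & 0xff);
}
```

Because the mapping is many-to-one, a receiver that filters on the MAC alone still has to check the IP destination address of each packet.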
Steve Wise wrote:
> Do you all have a user mode UDP/IP running on this raw qp?
Yes, we use a modified mckey for most tests.
Steve Wise wrote:
> If so, does it use its own IP address separate from the host stack or does it share the host's IP address.
Our test application uses the IP address of the host interface as its source IP address.
Regards,
Mirek
-----Original Message-----
From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of Steve Wise
Sent: Wednesday, May 05, 2010 4:56 PM
To: Walukiewicz, Miroslaw
Cc: rdreier-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org; linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH 2/2] RDMA/nes: add support of iWARP multicast acceleration over IB_QPT_RAW_ETY QP type
> I see here some misunderstanding. Let me explain better how our transmit path works.
>
> In our implementation we use normal memory registration path using ibv_reg_mr and we use ibv_post_send() with lkey/vaddr/len.
>
> The implementation of ibv_post_send (nes_post_send in libnes) for RAW QP passes lkey/virtual_addr/len information to kernel using shared page to our device driver (ud_post_send). There is no data copy here and the driver is used only for fast synchronization.
>
> Because our RAW ETH QP must use physical addresses only, ud_post_send() in kernel makes a virtual to physical memory translation and accesses the QP HW for packet transmission. Previously a packet buffer memory was registered and pinned by ibv_reg_mr to provide necessary information for making such translation.
>
>
I see. Thanks!
> The non-bypass post_send/recv channel (using /dev/infiniband/rdma_cm) is shared with all other user-kernel communication and it is quite complex. It is a perfect path for QP/CQ/PD/mem management but for me it is too complex for traffic acceleration.
>
> The user<->kernel path through additional driver, shared page for lkey/vaddr/len passing and SW memory translation in kernel is much more effective.
>
> Maybe it is a good idea to make that API more official after some kind of standardization. Our tests proved that it works. We achieved twice better performance and latency. That way could open the way for adding some non-RDMA devices to the devices supported by the OFED API.
>
>
Sounds good.
Do you have specific perf numbers to share? Is this all just optimizing
mcast packets?
Also:
Does this raw qp service share the mac address with the ports being used
by the host stack? Or does each raw qp get its own mac address?
Do you all have a user mode UDP/IP running on this raw qp? If so, does
it use its own IP address separate from the host stack or does it share
the host's IP address.
Thanks,
Steve.
^ permalink raw reply
* Re: [PATCH] infiniband-diags/scripts/ibcheckerrs.in: emulate all ports if necessary.
From: Ira Weiny @ 2010-05-06 16:18 UTC (permalink / raw)
To: Sasha Khapyorsky, Robert Woodruff
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Michael Heinz,
Openfabrics-ewg-0P3JtQMG0aQdnm+yROfE0A
In-Reply-To: <20100505194720.6f9d434c.weiny2-i2BcT+NCU+M@public.gmane.org>
Upon thinking about this a bit more, and seeing Mike's patch, I think that the
patch which Mike sent some time ago is a better fix. The patch below will work
fine for ibcheckerrs. However, ibcheckerrors will run AllPortSelect and then go
on to query all the ports individually. The patch below will cause a double read
for each port, which will kill ibcheckerrors performance on a large cluster.
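To make the cost concrete, here is a small sketch of the decision the quoted perfquery hunk makes, plus a read count for a script that asks for "all ports" and then walks each port. The 0x100 mask and the port value 255 come from the thread; the enum, the function names and the read-counting model are hypothetical simplifications.

```c
#include <assert.h>
#include <stdint.h>

#define PMA_CAP_ALL_PORT_SELECT 0x100	/* bit 8, as in the quoted code */
#define ALL_PORTS 255			/* port value used in "perfquery 40 255" */

enum query_plan {
	QUERY_SINGLE,		/* one specific port was requested */
	QUERY_ALL_PORT_SELECT,	/* device aggregates all ports itself */
	QUERY_LOOP_PORTS,	/* emulate "all" by reading every port */
	QUERY_UNSUPPORTED	/* "all" requested, no support, no emulation */
};

enum query_plan plan_query(uint16_t cap_mask, int port, int emulate_all)
{
	if (port != ALL_PORTS)
		return QUERY_SINGLE;
	if (cap_mask & PMA_CAP_ALL_PORT_SELECT)
		return QUERY_ALL_PORT_SELECT;
	return emulate_all ? QUERY_LOOP_PORTS : QUERY_UNSUPPORTED;
}

/* Reads issued when a script first asks for "all ports" and then
 * checks each of nports ports individually. */
int reads_for_switch(uint16_t cap_mask, int nports, int emulate_all)
{
	int reads = nports;	/* the per-port pass */

	assert(nports > 0);
	switch (plan_query(cap_mask, ALL_PORTS, emulate_all)) {
	case QUERY_ALL_PORT_SELECT:
		reads += 1;		/* one aggregated read */
		break;
	case QUERY_LOOP_PORTS:
		reads += nports;	/* every port is read a second time */
		break;
	default:
		break;
	}
	return reads;
}
```

Under this model a 36-port switch without AllPortSelect costs 72 reads when "all" is emulated, against 37 on a switch that supports it.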
Sasha, what is the status of Mike's patch?
Ira
On Wed, 5 May 2010 19:47:20 -0700
Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org> wrote:
>
> From: Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org>
> Date: Wed, 5 May 2010 19:49:37 -0700
> Subject: [PATCH] infiniband-diags/scripts/ibcheckerrs.in: emulate all ports if necessary.
>
>
> Signed-off-by: Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org>
> ---
> infiniband-diags/scripts/ibcheckerrs.in | 2 +-
> infiniband-diags/src/perfquery.c | 6 +++---
> 2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/infiniband-diags/scripts/ibcheckerrs.in b/infiniband-diags/scripts/ibcheckerrs.in
> index 305379a..f4eb451 100644
> --- a/infiniband-diags/scripts/ibcheckerrs.in
> +++ b/infiniband-diags/scripts/ibcheckerrs.in
> @@ -153,7 +153,7 @@ fi
>
> nodename=`$IBPATH/smpquery $ca_info nodedesc $lid | sed -e "s/^Node Description:\.*\(.*\)/\1/"`
>
> -text="`eval $IBPATH/perfquery $ca_info $lid $portnum`"
> +text="`eval $IBPATH/perfquery -a $ca_info $lid $portnum`"
> rv=$?
> if echo "$text" | awk -v mono=$bw -v brief=$brief -F '[.:]*' '
> function blue(s)
> diff --git a/infiniband-diags/src/perfquery.c b/infiniband-diags/src/perfquery.c
> index 00ebfff..5d3b606 100644
> --- a/infiniband-diags/src/perfquery.c
> +++ b/infiniband-diags/src/perfquery.c
> @@ -525,11 +525,11 @@ int main(int argc, char **argv)
> /* ClassPortInfo should be supported as part of libibmad */
> memcpy(&cap_mask, pc + 2, sizeof(cap_mask)); /* CapabilityMask */
> cap_mask = ntohs(cap_mask);
> - if (!(cap_mask & 0x100)) { /* bit 8 is AllPortSelect */
> - if (!all_ports && port == ALL_PORTS)
> - IBERROR("AllPortSelect not supported");
> + if (port == ALL_PORTS && !(cap_mask & 0x100)) { /* bit 8 is AllPortSelect */
> if (all_ports)
> all_ports_loop = 1;
> + else
> + IBERROR("AllPortSelect not supported");
> }
>
> if (xmt_sl) {
> --
> 1.5.4.5
>
--
Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org>
^ permalink raw reply
* Re: [PATCHv8 06/11] ipoib: avoid ipoib over IBoE
From: Roland Dreier @ 2010-05-06 16:13 UTC (permalink / raw)
To: Eli Cohen; +Cc: Linux RDMA list, ewg
In-Reply-To: <20100218172420.GG12286@mtls03>
OK, I applied this with just the first chunk.
--
Roland Dreier <rolandd-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org> || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html
^ permalink raw reply
* Re: [PATCH] mlx4: Fix chunk sg list overflow in mlx4_alloc_icm()
From: Roland Dreier @ 2010-05-06 16:08 UTC (permalink / raw)
To: sebastien dugue; +Cc: Eli Cohen, linux-rdma
In-Reply-To: <20100506080453.43594a4b@frecb007965>
> Yes, some customer got hit by this, which ended up corrupting memory.
I'll tag it for stable kernels too then. Thanks.
--
Roland Dreier <rolandd-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org> || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html
^ permalink raw reply
* Re: [PATCH/RFC] cxgb4: Add MAINTAINERS info
From: Roland Dreier @ 2010-05-06 16:07 UTC (permalink / raw)
To: Or Gerlitz
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW, dm-ut6Up61K2wZBDgjK7y7TUQ
In-Reply-To: <4BE25A3D.20800-smomgflXvOZWk0Htik3J/w@public.gmane.org>
> not sure who's the butterfly that caused this, but this was somehow
> committed as "CXGB4 ETHERNET DRIVER (CXGB3)" and same goes for the
> IW_ piece
Thanks, I think I committed, saw the problem, fixed it up, sent the RFC,
and then pushed my tree. I fixed it up now. Pretty impressive eagle
eyes to notice that...
--
Roland Dreier <rolandd-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org> || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html
^ permalink raw reply
* RE: [ewg] ibcheckerrors "Port All FAILED" reported
From: Mike Heinz @ 2010-05-06 15:41 UTC (permalink / raw)
To: Ira Weiny, Sasha Khapyorsky
Cc: Woodruff, Robert J,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, EWG,
tziporet-VPRAkNaXOzVS1MOuV/RT9w@public.gmane.org
In-Reply-To: <20100506083455.951377af.weiny2-i2BcT+NCU+M@public.gmane.org>
Yup - I've also sent a note to Sasha asking what happened to the patch.
-----Original Message-----
From: Ira Weiny [mailto:weiny2-i2BcT+NCU+M@public.gmane.org]
Sent: Thursday, May 06, 2010 11:35 AM
To: Mike Heinz; Sasha Khapyorsky
Cc: Woodruff, Robert J; linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; EWG; tziporet-VPRAkNaXOzVS1MOuV/RT9w@public.gmane.org
Subject: Re: [ewg] ibcheckerrors "Port All FAILED" reported
On Thu, 6 May 2010 06:26:55 -0700
Mike Heinz <michael.heinz-h88ZbnxC6KDQT0dZR+AlfA@public.gmane.org> wrote:
> Ira,
>
> I'm pretty sure I already fixed this problem. I submitted a patch to Sasha
> back in April.
The tests below are with the current master.
git://git.openfabrics.org/~sashak/management
Ira
>
>
> -----Original Message-----
> From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of Ira Weiny
> Sent: Wednesday, May 05, 2010 9:10 PM
> To: Woodruff, Robert J; linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> Cc: EWG; tziporet-VPRAkNaXOzVS1MOuV/RT9w@public.gmane.org
> Subject: Re: [ewg] ibcheckerrors "Port All FAILED" reported
>
> Interesting...
>
> I have a switch which does this as well. Tracing through the scripts shows
> that the perfquery command is failing like this.
>
> 14:29:03 > ./perfquery 40 255
> ./perfquery: iberror: failed: AllPortSelect not supported
>
> It seems there is an issue with the CapabilityMask value...
>
> 14:43:32 > ./perfquery 40 255
> cap_mask 0x400 <=== my debug output
> ./perfquery: iberror: failed: AllPortSelect not supported
>
> 14:43:38 > ./saquery CPI 40
> SA ClassPortInfo:
> ...
> Capability mask..........0x2602
> ...
>
> Those don't match because... perfquery has a bug...
>
> perfquery is issuing a PMA query when it should be issuing a SA query. It
> just so happens that on some switches the result of that PMA query indicates
> AllPortSelect is available. Patch to follow.
>
> Ira
>
>
> On Wed, 5 May 2010 13:47:54 -0700
> "Woodruff, Robert J" <robert.j.woodruff-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>
> >
> > Hi guys,
> >
> > When I run ibcheckerrors on my Mellanox switch,
> > it is reporting that Port all FAILED.
> >
> > From what I can tell, the switch is working fine and
> > I think that this is a bogus error from the program.
> >
> > If this is indeed not a real problem, can the diagnostic
> > be fixed to not report this as an error ?
> >
> >
> > ibcheckerrors -nocolor -v -t 100
> >
> > # Checking Switch: nodeguid 0x0002c902004046a0
> > Node check lid 7: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port all: FAILED <------------
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 2: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 3: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 7: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 8: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 9: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 10: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 17: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 18: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 20: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 25: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 26: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 27: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 28: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 34: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 35: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 36: OK
> >
> > Checking Ca: nodeguid 0x0002c9030002628a
> > Node check lid 14: OK
> > Error check on lid 14 (cstnh-2 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c90300025e0a
> > Node check lid 12: OK
> > Error check on lid 12 (cstnh-3 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c9030002615e
> > Node check lid 15: OK
> > Error check on lid 15 (cstnh-4 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c9030008e442
> > Node check lid 11: OK
> > Error check on lid 11 (cstnh-8 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c9030008e44e
> > Node check lid 8: OK
> > Error check on lid 8 (cstnh-11 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c9030008e3e6
> > Node check lid 2: OK
> > Error check on lid 2 (cstnh-13 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c9030008e44a
> > Node check lid 18: OK
> > Error check on lid 18 (cstnh-9 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c90300044fb4
> > Node check lid 13: OK
> > Error check on lid 13 (cstnh-7 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c90300044fbc
> > Node check lid 10: OK
> > Error check on lid 10 (cstnh-1 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c9030008e3ee
> > Node check lid 9: OK
> > Error check on lid 9 (cstnh-10 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c9030008e446
> > Node check lid 4: OK
> > Error check on lid 4 (cstnh-12 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c9030008e22e
> > Node check lid 1: OK
> > Error check on lid 1 (cstnh-14 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c9030008e43e
> > Node check lid 19: OK
> > Error check on lid 19 (cstnh-15 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0090270002000345
> > Node check lid 6: OK
> > Error check on lid 6 (cstnh-5 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0090270002000335
> > Node check lid 5: OK
> > Error check on lid 5 (cstnh-6 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c90300028238
> > Node check lid 3: OK
> > Error check on lid 3 (cst-linux HCA-1) port 1: OK
> >
> > ## Summary: 17 nodes checked, 0 bad nodes found
> > ## 32 ports checked, 0 ports have errors beyond threshold
> > _______________________________________________
> > ewg mailing list
> > ewg-ZwoEplunGu1OwGhvXhtEPSCwEArCW2h5@public.gmane.org
> > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
> >
>
>
> --
> Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org>
>
--
Ira Weiny
Math Programmer/Computer Scientist
Lawrence Livermore National Lab
925-423-8008
weiny2-i2BcT+NCU+M@public.gmane.org
^ permalink raw reply
* Re: [ewg] ibcheckerrors "Port All FAILED" reported
From: Ira Weiny @ 2010-05-06 15:34 UTC (permalink / raw)
To: Mike Heinz, Sasha Khapyorsky
Cc: Woodruff, Robert J,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, EWG,
tziporet-VPRAkNaXOzVS1MOuV/RT9w@public.gmane.org
In-Reply-To: <4C2744E8AD2982428C5BFE523DF8CDCB49A4740C58-amwN6d8PyQWXx9kJd3VG2h2eb7JE58TQ@public.gmane.org>
On Thu, 6 May 2010 06:26:55 -0700
Mike Heinz <michael.heinz-h88ZbnxC6KDQT0dZR+AlfA@public.gmane.org> wrote:
> Ira,
>
> I'm pretty sure I already fixed this problem. I submitted a patch to Sasha
> back in April.
The tests below are with the current master.
git://git.openfabrics.org/~sashak/management
Ira
>
>
> -----Original Message-----
> From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of Ira Weiny
> Sent: Wednesday, May 05, 2010 9:10 PM
> To: Woodruff, Robert J; linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> Cc: EWG; tziporet-VPRAkNaXOzVS1MOuV/RT9w@public.gmane.org
> Subject: Re: [ewg] ibcheckerrors "Port All FAILED" reported
>
> Interesting...
>
> I have a switch which does this as well. Tracing through the scripts shows
> that the perfquery command is failing like this.
>
> 14:29:03 > ./perfquery 40 255
> ./perfquery: iberror: failed: AllPortSelect not supported
>
> It seems there is an issue with the CapabilityMask value...
>
> 14:43:32 > ./perfquery 40 255
> cap_mask 0x400 <=== my debug output
> ./perfquery: iberror: failed: AllPortSelect not supported
>
> 14:43:38 > ./saquery CPI 40
> SA ClassPortInfo:
> ...
> Capability mask..........0x2602
> ...
>
> Those don't match because... perfquery has a bug...
>
> perfquery is issuing a PMA query when it should be issuing a SA query. It
> just so happens that on some switches the result of that PMA query indicates
> AllPortSelect is available. Patch to follow.
>
> Ira
>
>
> On Wed, 5 May 2010 13:47:54 -0700
> "Woodruff, Robert J" <robert.j.woodruff-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>
> >
> > Hi guys,
> >
> > When I run ibcheckerrors on my Mellanox switch,
> > it is reporting that Port all FAILED.
> >
> > From what I can tell, the switch is working fine and
> > I think that this is a bogus error from the program.
> >
> > If this is indeed not a real problem, can the diagnostic
> > be fixed to not report this as an error ?
> >
> >
> > ibcheckerrors -nocolor -v -t 100
> >
> > # Checking Switch: nodeguid 0x0002c902004046a0
> > Node check lid 7: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port all: FAILED <------------
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 2: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 3: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 7: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 8: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 9: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 10: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 17: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 18: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 20: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 25: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 26: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 27: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 28: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 34: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 35: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 36: OK
> >
> > Checking Ca: nodeguid 0x0002c9030002628a
> > Node check lid 14: OK
> > Error check on lid 14 (cstnh-2 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c90300025e0a
> > Node check lid 12: OK
> > Error check on lid 12 (cstnh-3 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c9030002615e
> > Node check lid 15: OK
> > Error check on lid 15 (cstnh-4 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c9030008e442
> > Node check lid 11: OK
> > Error check on lid 11 (cstnh-8 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c9030008e44e
> > Node check lid 8: OK
> > Error check on lid 8 (cstnh-11 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c9030008e3e6
> > Node check lid 2: OK
> > Error check on lid 2 (cstnh-13 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c9030008e44a
> > Node check lid 18: OK
> > Error check on lid 18 (cstnh-9 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c90300044fb4
> > Node check lid 13: OK
> > Error check on lid 13 (cstnh-7 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c90300044fbc
> > Node check lid 10: OK
> > Error check on lid 10 (cstnh-1 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c9030008e3ee
> > Node check lid 9: OK
> > Error check on lid 9 (cstnh-10 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c9030008e446
> > Node check lid 4: OK
> > Error check on lid 4 (cstnh-12 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c9030008e22e
> > Node check lid 1: OK
> > Error check on lid 1 (cstnh-14 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c9030008e43e
> > Node check lid 19: OK
> > Error check on lid 19 (cstnh-15 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0090270002000345
> > Node check lid 6: OK
> > Error check on lid 6 (cstnh-5 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0090270002000335
> > Node check lid 5: OK
> > Error check on lid 5 (cstnh-6 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c90300028238
> > Node check lid 3: OK
> > Error check on lid 3 (cst-linux HCA-1) port 1: OK
> >
> > ## Summary: 17 nodes checked, 0 bad nodes found
> > ## 32 ports checked, 0 ports have errors beyond threshold
> >
>
>
> --
> Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org>
>
--
Ira Weiny
Math Programmer/Computer Scientist
Lawrence Livermore National Lab
925-423-8008
weiny2-i2BcT+NCU+M@public.gmane.org
^ permalink raw reply
* Re: [ewg] [PATCHv8 02/11] ib_core: IBoE support only QP1
From: Eli Cohen @ 2010-05-06 14:28 UTC (permalink / raw)
To: Roland Dreier; +Cc: Eli Cohen, Linux RDMA list, ewg
In-Reply-To: <ada1vdqaqpc.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
On Wed, May 05, 2010 at 04:12:15PM -0700, Roland Dreier wrote:
> > @@ -795,11 +799,12 @@ static void mcast_add_one(struct ib_device *device)
> > struct mcast_device *dev;
> > struct mcast_port *port;
> > int i;
> > + int count = 0;
> >
> > if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
> > return;
> >
> > - dev = kmalloc(sizeof *dev + device->phys_port_cnt * sizeof *port,
> > + dev = kzalloc(sizeof *dev + device->phys_port_cnt * sizeof *port,
>
> > @@ -1007,7 +1010,7 @@ static void ib_sa_add_one(struct ib_device *device)
> > e = device->phys_port_cnt;
> > }
> >
> > - sa_dev = kmalloc(sizeof *sa_dev +
> > + sa_dev = kzalloc(sizeof *sa_dev +
>
> Do you happen to remember why you needed these kmalloc -> kzalloc conversions?
I can't remember why. I do have this habit of preferring kzalloc
over kmalloc because it saves trouble sometimes.
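One way a zeroed allocation can save trouble, sketched in user space with calloc() standing in for kzalloc(). The struct layout and function names are hypothetical, and this is only a guess at the kind of situation meant: if setup skips some ports, their entries still read as "not started" instead of holding stale heap contents.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-port state; the real structures differ. */
struct port {
	int started;
	void *agent;
};

struct device {
	int start_port;
	int end_port;
	struct port port[];	/* one entry per physical port */
};

/* Zeroed allocation: every port entry starts out as "not started". */
struct device *device_alloc(int nports)
{
	assert(nports > 0);
	return calloc(1, sizeof(struct device) +
			 (size_t)nports * sizeof(struct port));
}

/* Teardown-style walk that only touches ports marked as started.
 * With a non-zeroed allocation, skipped entries would hold garbage. */
int device_count_started(const struct device *dev, int nports)
{
	int i, count = 0;

	for (i = 0; i < nports; i++)
		if (dev->port[i].started)
			count++;
	return count;
}
```

With plain malloc() the same walk would need every entry to be initialized explicitly before it could be trusted.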
^ permalink raw reply
* RE: [ewg] ibcheckerrors "Port All FAILED" reported
From: Mike Heinz @ 2010-05-06 13:26 UTC (permalink / raw)
To: Ira Weiny, Woodruff, Robert J,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: EWG, tziporet-VPRAkNaXOzVS1MOuV/RT9w@public.gmane.org
In-Reply-To: <20100505180943.a9bbb74e.weiny2-i2BcT+NCU+M@public.gmane.org>
Ira,
I'm pretty sure I already fixed this problem. I submitted a patch to Sasha back in April.
-----Original Message-----
From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of Ira Weiny
Sent: Wednesday, May 05, 2010 9:10 PM
To: Woodruff, Robert J; linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: EWG; tziporet-VPRAkNaXOzVS1MOuV/RT9w@public.gmane.org
Subject: Re: [ewg] ibcheckerrors "Port All FAILED" reported
Interesting...
I have a switch which does this as well. Tracing through the scripts shows
that the perfquery command is failing like this.
14:29:03 > ./perfquery 40 255
./perfquery: iberror: failed: AllPortSelect not supported
It seems there is an issue with the CapabilityMask value...
14:43:32 > ./perfquery 40 255
cap_mask 0x400 <=== my debug output
./perfquery: iberror: failed: AllPortSelect not supported
14:43:38 > ./saquery CPI 40
SA ClassPortInfo:
...
Capability mask..........0x2602
...
Those don't match because... perfquery has a bug...
perfquery is issuing a PMA query when it should be issuing a SA query. It
just so happens that on some switches the result of that PMA query indicates
AllPortSelect is available. Patch to follow.
Ira
On Wed, 5 May 2010 13:47:54 -0700
"Woodruff, Robert J" <robert.j.woodruff-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>
> Hi guys,
>
> When I run ibcheckerrors on my Mellanox switch,
> it is reporting that Port all FAILED.
>
> From what I can tell, the switch is working fine and
> I think that this is a bogus error from the program.
>
> If this is indeed not a real problem, can the diagnostic
> be fixed to not report this as an error ?
>
>
> ibcheckerrors -nocolor -v -t 100
>
> # Checking Switch: nodeguid 0x0002c902004046a0
> Node check lid 7: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port all: FAILED <------------
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 2: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 3: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 7: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 8: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 9: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 10: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 17: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 18: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 20: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 25: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 26: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 27: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 28: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 34: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 35: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 36: OK
>
> Checking Ca: nodeguid 0x0002c9030002628a
> Node check lid 14: OK
> Error check on lid 14 (cstnh-2 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0002c90300025e0a
> Node check lid 12: OK
> Error check on lid 12 (cstnh-3 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0002c9030002615e
> Node check lid 15: OK
> Error check on lid 15 (cstnh-4 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0002c9030008e442
> Node check lid 11: OK
> Error check on lid 11 (cstnh-8 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0002c9030008e44e
> Node check lid 8: OK
> Error check on lid 8 (cstnh-11 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0002c9030008e3e6
> Node check lid 2: OK
> Error check on lid 2 (cstnh-13 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0002c9030008e44a
> Node check lid 18: OK
> Error check on lid 18 (cstnh-9 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0002c90300044fb4
> Node check lid 13: OK
> Error check on lid 13 (cstnh-7 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0002c90300044fbc
> Node check lid 10: OK
> Error check on lid 10 (cstnh-1 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0002c9030008e3ee
> Node check lid 9: OK
> Error check on lid 9 (cstnh-10 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0002c9030008e446
> Node check lid 4: OK
> Error check on lid 4 (cstnh-12 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0002c9030008e22e
> Node check lid 1: OK
> Error check on lid 1 (cstnh-14 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0002c9030008e43e
> Node check lid 19: OK
> Error check on lid 19 (cstnh-15 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0090270002000345
> Node check lid 6: OK
> Error check on lid 6 (cstnh-5 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0090270002000335
> Node check lid 5: OK
> Error check on lid 5 (cstnh-6 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0002c90300028238
> Node check lid 3: OK
> Error check on lid 3 (cst-linux HCA-1) port 1: OK
>
> ## Summary: 17 nodes checked, 0 bad nodes found
> ## 32 ports checked, 0 ports have errors beyond threshold
> _______________________________________________
> ewg mailing list
> ewg-ZwoEplunGu1OwGhvXhtEPSCwEArCW2h5@public.gmane.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
>
--
Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org>
^ permalink raw reply
* Re: [patch v3] infiniband: ulp/iser, fix error retval in iser_create_ib_conn_res
From: Or Gerlitz @ 2010-05-06 13:22 UTC (permalink / raw)
To: Roland Dreier
Cc: Dan Carpenter, Jiri Slaby, linux-kernel, jirislaby, linux-rdma
In-Reply-To: <ada4oimmd9p.fsf@roland-alpha.cisco.com>
Roland Dreier wrote:
> Or, I don't think we ever fixed this. This patch looks correct to me,
> any problem with merging this for 2.6.35?
Roland, please use V4 below, the patch is okay and would apply before and after
applying the multipathing patches I sent yesterday (same goes for them).
[PATCH V4] ib/iser: fix error flow in iser_create_ib_conn_res
From: Dan Carpenter <error27@gmail.com>
We shouldn't free things here because we free them later.
The call tree looks like this:
iser_connect() ==> initiating the connection establishment
and later
iser_cma_handler() => iser_route_handler() => iser_create_ib_conn_res()
If we fail here, iser_conn_release() is eventually called, resulting in a double free.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
---
V1 fixed unreachable code
V2 noticed that the original code had a double free
V3 Roland Dreier points out that I left a dangling ERR_PTR() in
"ib_conn->fmr_pool" which would be freed later on.
V4 reviewed/enhanced the change-log
---
drivers/infiniband/ulp/iser/iser_verbs.c | 25 +++++++++----------------
1 file changed, 9 insertions(+), 16 deletions(-)
Index: linux-2.6.34-rc6/drivers/infiniband/ulp/iser/iser_verbs.c
===================================================================
--- linux-2.6.34-rc6.orig/drivers/infiniband/ulp/iser/iser_verbs.c
+++ linux-2.6.34-rc6/drivers/infiniband/ulp/iser/iser_verbs.c
@@ -163,10 +163,8 @@ static int iser_create_ib_conn_res(struc
device = ib_conn->device;
ib_conn->login_buf = kmalloc(ISER_RX_LOGIN_SIZE, GFP_KERNEL);
- if (!ib_conn->login_buf) {
- goto alloc_err;
- ret = -ENOMEM;
- }
+ if (!ib_conn->login_buf)
+ goto out_err;
ib_conn->login_dma = ib_dma_map_single(ib_conn->device->ib_device,
(void *)ib_conn->login_buf, ISER_RX_LOGIN_SIZE,
@@ -175,10 +173,9 @@ static int iser_create_ib_conn_res(struc
ib_conn->page_vec = kmalloc(sizeof(struct iser_page_vec) +
(sizeof(u64) * (ISCSI_ISER_SG_TABLESIZE +1)),
GFP_KERNEL);
- if (!ib_conn->page_vec) {
- ret = -ENOMEM;
- goto alloc_err;
- }
+ if (!ib_conn->page_vec)
+ goto out_err;
+
ib_conn->page_vec->pages = (u64 *) (ib_conn->page_vec + 1);
params.page_shift = SHIFT_4K;
@@ -198,7 +195,8 @@ static int iser_create_ib_conn_res(struc
ib_conn->fmr_pool = ib_create_fmr_pool(device->pd, &params);
if (IS_ERR(ib_conn->fmr_pool)) {
ret = PTR_ERR(ib_conn->fmr_pool);
- goto fmr_pool_err;
+ ib_conn->fmr_pool = NULL;
+ goto out_err;
}
memset(&init_attr, 0, sizeof init_attr);
@@ -216,7 +214,7 @@ static int iser_create_ib_conn_res(struc
ret = rdma_create_qp(ib_conn->cma_id, device->pd, &init_attr);
if (ret)
- goto qp_err;
+ goto out_err;
ib_conn->qp = ib_conn->cma_id->qp;
iser_err("setting conn %p cma_id %p: fmr_pool %p qp %p\n",
@@ -224,12 +222,7 @@ static int iser_create_ib_conn_res(struc
ib_conn->fmr_pool, ib_conn->cma_id->qp);
return ret;
-qp_err:
- (void)ib_destroy_fmr_pool(ib_conn->fmr_pool);
-fmr_pool_err:
- kfree(ib_conn->page_vec);
- kfree(ib_conn->login_buf);
-alloc_err:
+out_err:
iser_err("unable to alloc mem or create resource, err %d\n", ret);
return ret;
}
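The ownership rule this patch relies on can be sketched in userspace C. This is only an illustration under stated assumptions: struct conn, conn_create_res() and conn_release() are hypothetical stand-ins for the iser structures and functions, and malloc()/free() stand in for the kernel allocators.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical connection object with two owned buffers. */
struct conn {
	void *login_buf;
	void *page_vec;
};

/*
 * Setup step. On failure it frees nothing: whatever was allocated so
 * far stays attached to the connection, and the single release
 * function below is the only place that frees it.
 * fail_second simulates a failed second allocation.
 */
static int conn_create_res(struct conn *c, int fail_second)
{
	c->login_buf = malloc(64);
	if (!c->login_buf)
		return -1;

	c->page_vec = fail_second ? NULL : malloc(64);
	if (!c->page_vec)
		return -1;	/* login_buf is left for conn_release() */

	return 0;
}

/* The one and only cleanup path; safe to call after a partial setup. */
static void conn_release(struct conn *c)
{
	free(c->page_vec);	/* free(NULL) is a no-op */
	free(c->login_buf);
	c->page_vec = NULL;
	c->login_buf = NULL;
}
```

If conn_create_res() also freed login_buf on its error path, the later conn_release() call would free the same pointer a second time, which is the double free described in the changelog.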
^ permalink raw reply
* Re: [PATCHv8 06/11] ipoib: avoid ipoib over IBoE
From: Eli Cohen @ 2010-05-06 11:02 UTC (permalink / raw)
To: Roland Dreier; +Cc: Linux RDMA list, Eli Cohen, ewg
In-Reply-To: <adamxweasrc.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
On Wed, May 05, 2010 at 03:27:51PM -0700, Roland Dreier wrote:
> > @@ -1383,6 +1385,9 @@ static void ipoib_remove_one(struct ib_device *device)
> > dev_list = ib_get_client_data(device, &ipoib_client);
> >
> > list_for_each_entry_safe(priv, tmp, dev_list, list) {
> > + if (rdma_port_link_layer(device, priv->port) != IB_LINK_LAYER_INFINIBAND)
> > + continue;
>
> Why do we need this chunk here? How could a netdev get on our list if
> we never create IPoIB interfaces for IBoE ports?
>
Right, this is not necessary and can be removed.
^ permalink raw reply
* Re: [ewg] [PATCHv8 03/11] IB/umad: Enable support only for IB ports
From: Eli Cohen @ 2010-05-06 10:48 UTC (permalink / raw)
To: Roland Dreier; +Cc: Eli Cohen, Linux RDMA list, ewg
In-Reply-To: <adazl0eatj6.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
On Wed, May 05, 2010 at 03:11:09PM -0700, Roland Dreier wrote:
> Why do we not allow umad for IBoE ports? I understand there's no QP0
> but why can't userspace use QP1 just like for IB link layer ports?
Currently QP1 is only used by the CM protocol which is implemented in
the kernel.
Since we handle the iboe specific flow in the cma rather than the SA,
there is no need to expose qp1 to userspace.
^ permalink raw reply
* Re: [PATCHv8 01/11] ib core: Add link layer property to ports
From: Eli Cohen @ 2010-05-06 10:02 UTC (permalink / raw)
To: Roland Dreier; +Cc: Linux RDMA list, Eli Cohen, ewg
In-Reply-To: <adavdb2at4p.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
On Wed, May 05, 2010 at 03:19:50PM -0700, Roland Dreier wrote:
> Hi Eli,
>
> I'm hoping to get this IBoE stuff in for 2.6.35. I started an "iboe"
> branch in my tree (similar to the xrc branch I've been carrying for a
> while), and I added this patch in, except I renamed
> rdma_port_link_layer() to rdma_port_get_link_layer(). This seems to
> match rdma_node_get_transport() better.
>
> In any case as I add patches to my branch, you can stop worrying about
> them, which should make keeping this series updated easier.
>
Hi Roland,
I am glad to hear this and will be happy to help.
^ permalink raw reply
* Re: [PATCH/RFC] cxgb4: Add MAINTAINERS info
From: Dimitris Michailidis @ 2010-05-06 8:43 UTC (permalink / raw)
To: Roland Dreier
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW
In-Reply-To: <adawrvijhpq.fsf_-_-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
Roland Dreier wrote:
> Hi guys, does this info for cxgb4/iw_cxgb4 (pretty much copied from
> cxgb3, except with Dimitris instead of Divy) look right? If so I'll add
> it to my tree.
Yes, it's fine with me. Thanks.
>
> Thanks,
> Roland
> ---
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 7a9ccda..a00231b 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1719,6 +1719,20 @@ W: http://www.openfabrics.org
> S: Supported
> F: drivers/infiniband/hw/cxgb3/
>
> +CXGB4 ETHERNET DRIVER (CXGB4)
> +M: Dimitris Michailidis <dm-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org>
> +L: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> +W: http://www.chelsio.com
> +S: Supported
> +F: drivers/net/cxgb4/
> +
> +CXGB4 IWARP RNIC DRIVER (IW_CXGB4)
> +M: Steve Wise <swise-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org>
> +L: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> +W: http://www.openfabrics.org
> +S: Supported
> +F: drivers/infiniband/hw/cxgb4/
> +
> CYBERPRO FB DRIVER
> M: Russell King <linux-lFZ/pmaqli7XmaaqVzeoHQ@public.gmane.org>
> L: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org (moderated for non-subscribers)
^ permalink raw reply
* Re: [PATCH 2/3] ib/iser: remove buggy back-pointer setting
From: Or Gerlitz @ 2010-05-06 8:24 UTC (permalink / raw)
To: Mike Christie; +Cc: Roland Dreier, linux-rdma
In-Reply-To: <4BE1AFF4.30600-hcNo3dDEHLuVc3sceRu5cw@public.gmane.org>
Mike Christie wrote:
> I agree on it being a bug, but do you remember why that was added to iscsi_iser_conn_destroy originally?
> (I later moved it to iser_conn_release in commit b40977d95fb3a1898ace6a7d97e4ed1a33a440a4)
> but I think Erez had added that null and some checks for it being null for a specific bug.
> I am not 100% sure. Look in the git logs to make sure. I will check them too when I get some more time.
Mike, I took a look at the git history. In commit 87e8df7a273c7c1acb864c90b5253609c44375a6, "Have iSER data transaction object point to iSER conn", Erez added these two chunks:
> @@ -317,6 +317,8 @@ iscsi_iser_conn_destroy(struct iscsi_cls_conn *cls_conn)
> + if (iser_conn->ib_conn)
> + iser_conn->ib_conn->iser_conn = NULL;
> @@ -571,6 +571,8 @@ void iser_conn_release(struct iser_conn *ib_conn)
> + if (ib_conn->iser_conn)
> + ib_conn->iser_conn->ib_conn = NULL;
The problem he was trying to solve was related to the processing of RX/TX buffers flushed by the QP throughout the disconnection flow, so he was shutting down the UP/DOWN pointers.
Later, in commit b40977d95, you touched the UP NULL-ing, leaving it in a different form. I don't see any specific reason to keep the buggy DOWN NULL-ing in iser_conn_release; I believe it doesn't solve any problem and adds a bug. That is what my patch is meant to fix, okay?
Or.
^ permalink raw reply
* Re: [ewg] [PATCH] mlx4_core: request MSIX vectors as much as there CPU cores
From: Eli Cohen @ 2010-05-06 7:49 UTC (permalink / raw)
To: Roland Dreier
Cc: tziporet-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb, Jason Gunthorpe,
Linux RDMA list, Eli Cohen, ewg, yevgenyp-VPRAkNaXOzVS1MOuV/RT9w
In-Reply-To: <ada6332jf79.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
On Wed, May 05, 2010 at 12:55:54PM -0700, Roland Dreier wrote:
> > We found it in performance work of our EN (10G) driver
>
> By the way, it would certainly make sense for the ethernet driver to use
> a number of queues that matches num_online_cpus() at the time the
> interface is brought up. Since we can't change the # of MSI-X vectors
> very easily I think we need to allow for the possible CPUs, but bouncing
> a net interface seems lighter weight to me.
>
> Although perhaps reloading a driver on CPU hotplug is OK too?
>
Yes, we have a system where num_possible_cpus is 32 and
num_online_cpus is 16. It's a RH5.4 and the kernel has no problem
allocating 33 MSI-X vectors. The point is that using more than one EQ
per CPU core does not buy us anything; in fact it can contribute to a
higher rate of interrupts since the same EQ serves fewer CQs and the
chances for coalescing EQEs are lower.
So what do you think about the following patch to mlx4_en:
diff --git a/drivers/net/mlx4/en_cq.c b/drivers/net/mlx4/en_cq.c
index 21786ad..07c0779 100644
--- a/drivers/net/mlx4/en_cq.c
+++ b/drivers/net/mlx4/en_cq.c
@@ -49,11 +49,12 @@ int mlx4_en_create_cq(struct mlx4_en_priv *priv,
{
struct mlx4_en_dev *mdev = priv->mdev;
int err;
+ int num_active_vectors = min_t(int, num_online_cpus(), mdev->dev->caps.num_comp_vectors);
cq->size = entries;
if (mode == RX) {
cq->buf_size = cq->size * sizeof(struct mlx4_cqe);
- cq->vector = ring % mdev->dev->caps.num_comp_vectors;
+ cq->vector = ring % num_active_vectors;
} else {
cq->buf_size = sizeof(struct mlx4_cqe);
cq->vector = 0;
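The effect of the change can be checked with a small standalone sketch. The helper names below are hypothetical, and the value 32 for num_comp_vectors is an assumption chosen to match the 32 possible / 16 online CPU system described above.

```c
#include <assert.h>

static int min_int(int a, int b)
{
	return a < b ? a : b;
}

/*
 * Pick the completion vector for an RX ring. Rings are spread only
 * over as many vectors as there are online CPUs, so no CPU ends up
 * servicing more than one EQ for this interface.
 */
static int rx_ring_vector(int ring, int online_cpus, int num_comp_vectors)
{
	int active = min_int(online_cpus, num_comp_vectors);

	assert(ring >= 0 && active > 0);
	return ring % active;
}
```

With 16 online CPUs and 32 completion vectors, ring 17 lands on vector 1 instead of vector 17, so it shares an EQ with ring 1 and the two have a better chance of coalescing EQEs.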
^ permalink raw reply related
* [RFC] libibverbs: ibv_fork_init() and libhugetlbfs
From: Alexander Schmidt @ 2010-05-06 7:39 UTC (permalink / raw)
To: Roland Dreier, of-ewg, Linux RDMA
Cc: Hoang-Nam Nguyen, Stefan Roscher, Joachim Fenkes,
Christoph Raisch
Hi all,
we are trying to make use of libhugetlbfs in an application that relies on
ibv_fork_init() to enable fork() support. The problem we are running into is
that calls to the madvise system call fail when registering a memory region
for memory that is provided by libhugetlbfs. We have written a preliminary
fix (see below) for this and are looking for comments / feedback to get an
acceptable solution.
When fork support is enabled in libibverbs, madvise() is called for every
memory page that is registered as a memory region. Memory ranges that
are passed to madvise() must be page aligned and the size must be a
multiple of the page size. libibverbs uses sysconf(_SC_PAGESIZE) to find
out the system page size and rounds all ranges passed to reg_mr() according
to this page size. When memory from libhugetlbfs is passed to reg_mr(), this
does not work as the page size for this memory range might be different
(e.g. 16 MB). So libibverbs would have to use the huge page size to
calculate a page aligned range for madvise.
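The rounding described above can be sketched as a standalone helper. The function name aligned_range() is hypothetical; the arithmetic follows the start/end computation in ibv_madvise_range() shown in the patch below, with the page size passed in as a parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Expand [base, base + size) to page boundaries. *start is the first
 * byte of the first page touched, *end the last byte of the last page
 * touched. page_size must be a power of two.
 */
static void aligned_range(uintptr_t base, size_t size, uintptr_t page_size,
			  uintptr_t *start, uintptr_t *end)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	*start = base & ~(page_size - 1);
	*end = ((base + size + page_size - 1) & ~(page_size - 1)) - 1;
}
```

For a 32-byte buffer at 0x1000010, a 4 KB page size yields the range 0x1000000-0x1000fff, while a 16 MB page size yields 0x1000000-0x1ffffff. Only the second range covers a whole huge page, which is why the 4 KB rounding is not sufficient for memory backed by huge pages.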
As huge pages are provided to the application "under the hood" when
preloading libhugetlbfs, the application does not have any knowledge about
when it registers a huge page or a usual page.
The patch below demonstrates a possible solution for this. It parses the
/proc/PID/maps file when registering a memory region and decides if the
memory that is to be registered is part of a libhugetlbfs range or not. If so,
a page size of 16 MB is used to align the memory range passed to madvise().
We see two problems with this: parsing the procfs file is not a very elegant
solution, and the 16 MB value is currently hardcoded. The latter point could be
solved by calling gethugepagesize() from libhugetlbfs, which would add a new
dependency to libibverbs.
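The per-line check at the heart of this approach can be sketched separately from the file handling. The helper below is hypothetical and takes one line in the /proc/PID/maps layout (address range, permissions, offset, device, inode, optional path); the sample path used with it is an assumption, since the real name of a libhugetlbfs mapping depends on the mount point.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return 1 if addr falls inside the mapping described by one maps
 * line and that mapping's path contains "libhugetlbfs", else 0.
 * Lines without a path (anonymous mappings) never match.
 */
static int line_covers_hugetlbfs(const char *line, unsigned long addr)
{
	unsigned long start, end;
	char path[256];

	if (sscanf(line, "%lx-%lx %*s %*s %*s %*s %255s",
		   &start, &end, path) != 3)
		return 0;

	if (!strstr(path, "libhugetlbfs"))
		return 0;

	return addr >= start && addr < end;
}
```

A caller would run this over each line read with fgets() and switch to the huge page size as soon as one line matches, as the patch below does.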
We are highly interested in reviews, comments, suggestions to get this solved
soon. Thanks!
Signed-off-by: Alexander Schmidt <alexs-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
---
src/memory.c | 50 +++++++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 47 insertions(+), 3 deletions(-)
--- libibverbs-1.1.2.orig/src/memory.c
+++ libibverbs-1.1.2/src/memory.c
@@ -40,6 +40,8 @@
#include <unistd.h>
#include <stdlib.h>
#include <stdint.h>
+#include <stdio.h>
+#include <string.h>
#include "ibverbs.h"
@@ -54,6 +56,8 @@
#define MADV_DOFORK 11
#endif
+#define HUGE_PAGE_SIZE (16 * 1024 * 1024)
+
struct ibv_mem_node {
enum {
IBV_RED,
@@ -446,6 +450,48 @@ static struct ibv_mem_node *__mm_find_st
return node;
}
+static void get_range_address(uintptr_t *start, uintptr_t *end, void *base, size_t size)
+{
+ pid_t pid;
+ FILE *file;
+ char buf[1024], lib[128];
+ int range_page_size = page_size;
+
+ pid = getpid();
+ snprintf(buf, sizeof(buf), "/proc/%d/maps", pid);
+
+ file = fopen(buf, "r");
+ if (!file)
+ goto out;
+
+ while (fgets(buf, sizeof(buf), file) != NULL) {
+ int n;
+ char *substr;
+ uintptr_t range_start, range_end;
+
+ n = sscanf(buf, "%lx-%lx %*s %*x %*s %*u %127s",
+ &range_start, &range_end, lib);
+
+ if (n < 3)
+ continue;
+
+ substr = strstr(lib, "libhugetlbfs");
+ if (substr) {
+ if ((uintptr_t) base >= range_start &&
+ (uintptr_t) base < range_end) {
+ range_page_size = HUGE_PAGE_SIZE;
+ break;
+ }
+ }
+ }
+ fclose(file);
+
+out:
+ *start = (uintptr_t) base & ~(range_page_size - 1);
+ *end = ((uintptr_t) (base + size + range_page_size - 1) &
+ ~(range_page_size - 1)) - 1;
+}
+
static int ibv_madvise_range(void *base, size_t size, int advice)
{
uintptr_t start, end;
@@ -458,9 +504,7 @@ static int ibv_madvise_range(void *base,
inc = advice == MADV_DONTFORK ? 1 : -1;
- start = (uintptr_t) base & ~(page_size - 1);
- end = ((uintptr_t) (base + size + page_size - 1) &
- ~(page_size - 1)) - 1;
+ get_range_address(&start, &end, base, size);
pthread_mutex_lock(&mm_mutex);
^ permalink raw reply
* Re: [PATCH] mlx4: Fix chunk sg list overflow in mlx4_alloc_icm()
From: sebastien dugue @ 2010-05-06 6:04 UTC (permalink / raw)
To: Roland Dreier; +Cc: Eli Cohen, linux-rdma
In-Reply-To: <adatyqmmej0.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
Hi Roland,
On Wed, 05 May 2010 10:42:11 -0700
Roland Dreier <rdreier-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org> wrote:
> good catch, applied. Did this hit you in practice? I guess it would
> take a big coherent ICM allocation, were you getting those?
Yes, some customer got hit by this, which ended up corrupting memory.
>
> Also what do you think of this independent cleanup on top of your patch?
> It handles the error case for allocation and then avoids having the
> common case inside a deeper nested block:
No problem, it indeed makes the code more readable.
Thanks,
Sebastien.
>
> drivers/net/mlx4/icm.c | 37 +++++++++++++++++++------------------
> 1 files changed, 19 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/net/mlx4/icm.c b/drivers/net/mlx4/icm.c
> index ef62f17..b07e4de 100644
> --- a/drivers/net/mlx4/icm.c
> +++ b/drivers/net/mlx4/icm.c
> @@ -163,29 +163,30 @@ struct mlx4_icm *mlx4_alloc_icm(struct mlx4_dev *dev, int npages,
> ret = mlx4_alloc_icm_pages(&chunk->mem[chunk->npages],
> cur_order, gfp_mask);
>
> - if (!ret) {
> - ++chunk->npages;
> -
> - if (coherent)
> - ++chunk->nsg;
> - else if (chunk->npages == MLX4_ICM_CHUNK_LEN) {
> - chunk->nsg = pci_map_sg(dev->pdev, chunk->mem,
> - chunk->npages,
> - PCI_DMA_BIDIRECTIONAL);
> -
> - if (chunk->nsg <= 0)
> - goto fail;
> - }
> + if (ret) {
> + if (--cur_order < 0)
> + goto fail;
> + else
> + continue;
> + }
>
> - if (chunk->npages == MLX4_ICM_CHUNK_LEN)
> - chunk = NULL;
> + ++chunk->npages;
>
> - npages -= 1 << cur_order;
> - } else {
> - --cur_order;
> - if (cur_order < 0)
> + if (coherent)
> + ++chunk->nsg;
> + else if (chunk->npages == MLX4_ICM_CHUNK_LEN) {
> + chunk->nsg = pci_map_sg(dev->pdev, chunk->mem,
> + chunk->npages,
> + PCI_DMA_BIDIRECTIONAL);
> +
> + if (chunk->nsg <= 0)
> goto fail;
> }
> +
> + if (chunk->npages == MLX4_ICM_CHUNK_LEN)
> + chunk = NULL;
> +
> + npages -= 1 << cur_order;
> }
>
> if (!coherent && chunk) {
>
^ permalink raw reply
* Re: [PATCH/RFC] cxgb4: Add MAINTAINERS info
From: Or Gerlitz @ 2010-05-06 5:57 UTC (permalink / raw)
To: Roland Dreier; +Cc: linux-rdma, netdev, swise, dm
In-Reply-To: <adawrvijhpq.fsf_-_@roland-alpha.cisco.com>
Roland Dreier wrote:
> +CXGB4 ETHERNET DRIVER (CXGB4)
>
not sure who's the butterfly that caused this, but this was somehow
committed as "CXGB4 ETHERNET DRIVER (CXGB3)" and same goes for the IW_
piece
Or.
^ permalink raw reply
* [PATCH] infiniband-diags/scripts/ibcheckerrs.in: emulate all ports if necessary.
From: Ira Weiny @ 2010-05-06 2:47 UTC (permalink / raw)
To: Sasha Khapyorsky; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
From: Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org>
Date: Wed, 5 May 2010 19:49:37 -0700
Subject: [PATCH] infiniband-diags/scripts/ibcheckerrs.in: emulate all ports if necessary.
Signed-off-by: Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org>
---
infiniband-diags/scripts/ibcheckerrs.in | 2 +-
infiniband-diags/src/perfquery.c | 6 +++---
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/infiniband-diags/scripts/ibcheckerrs.in b/infiniband-diags/scripts/ibcheckerrs.in
index 305379a..f4eb451 100644
--- a/infiniband-diags/scripts/ibcheckerrs.in
+++ b/infiniband-diags/scripts/ibcheckerrs.in
@@ -153,7 +153,7 @@ fi
nodename=`$IBPATH/smpquery $ca_info nodedesc $lid | sed -e "s/^Node Description:\.*\(.*\)/\1/"`
-text="`eval $IBPATH/perfquery $ca_info $lid $portnum`"
+text="`eval $IBPATH/perfquery -a $ca_info $lid $portnum`"
rv=$?
if echo "$text" | awk -v mono=$bw -v brief=$brief -F '[.:]*' '
function blue(s)
diff --git a/infiniband-diags/src/perfquery.c b/infiniband-diags/src/perfquery.c
index 00ebfff..5d3b606 100644
--- a/infiniband-diags/src/perfquery.c
+++ b/infiniband-diags/src/perfquery.c
@@ -525,11 +525,11 @@ int main(int argc, char **argv)
/* ClassPortInfo should be supported as part of libibmad */
memcpy(&cap_mask, pc + 2, sizeof(cap_mask)); /* CapabilityMask */
cap_mask = ntohs(cap_mask);
- if (!(cap_mask & 0x100)) { /* bit 8 is AllPortSelect */
- if (!all_ports && port == ALL_PORTS)
- IBERROR("AllPortSelect not supported");
+ if (port == ALL_PORTS && !(cap_mask & 0x100)) { /* bit 8 is AllPortSelect */
if (all_ports)
all_ports_loop = 1;
+ else
+ IBERROR("AllPortSelect not supported");
}
if (xmt_sl) {
--
1.5.4.5
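The decision made by the reworked check in perfquery.c can be sketched as a small pure function. The enum and function names are hypothetical, and ALL_PORTS is assumed to be 0xFF, matching the port number 255 used on the perfquery command line elsewhere in this thread.

```c
#include <assert.h>
#include <stdint.h>

#define ALL_PORTS		0xFF	/* assumed value, see lead-in */
#define CAP_ALL_PORT_SELECT	0x100	/* bit 8 of the PMA CapabilityMask */

enum port_action {
	QUERY_DIRECT,		/* send the query as requested */
	LOOP_OVER_PORTS,	/* emulate "all ports" one port at a time */
	FAIL_UNSUPPORTED	/* report "AllPortSelect not supported" */
};

/*
 * Mirror of the patched condition: the capability bit only matters
 * when the caller actually asked for all ports. emulate_all stands
 * for the -a option that the script now passes.
 */
static enum port_action choose_action(uint16_t cap_mask, int port,
				      int emulate_all)
{
	if (port == ALL_PORTS && !(cap_mask & CAP_ALL_PORT_SELECT))
		return emulate_all ? LOOP_OVER_PORTS : FAIL_UNSUPPORTED;

	return QUERY_DIRECT;
}
```

With the cap_mask value 0x400 reported elsewhere in this thread, bit 8 is clear, so a query for port 255 fails without -a and falls back to the per-port loop with -a.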
^ permalink raw reply related
* Re: [ewg] ibcheckerrors "Port All FAILED" reported
From: Ira Weiny @ 2010-05-06 1:57 UTC (permalink / raw)
To: Woodruff, Robert J,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, EWG,
tziporet-VPRAkNaXOzVS1MOuV/RT9w@public.gmane.org
In-Reply-To: <20100505180943.a9bbb74e.weiny2-i2BcT+NCU+M@public.gmane.org>
Never mind, I am wrong about the below.
However, there is an option to "emulate" all ports when AllPortSelect is not supported.
I believe that is a way to fix this.
Ira
On Wed, 5 May 2010 18:09:43 -0700
Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org> wrote:
> Interesting...
>
> I have a switch which does this as well. Tracing through the scripts shows
> that the perfquery command is failing like this.
>
> 14:29:03 > ./perfquery 40 255
> ./perfquery: iberror: failed: AllPortSelect not supported
>
> It seems there is an issue with the CapabilityMask value...
>
> 14:43:32 > ./perfquery 40 255
> cap_mask 0x400 <=== my debug output
> ./perfquery: iberror: failed: AllPortSelect not supported
>
> 14:43:38 > ./saquery CPI 40
> SA ClassPortInfo:
> ...
> Capability mask..........0x2602
> ...
>
> Those don't match because... perfquery has a bug...
>
> perfquery is issuing a PMA query when it should be issuing a SA query. It
> just so happens that on some switches the result of that PMA query indicates
> AllPortSelect is available. Patch to follow.
>
> Ira
>
>
> On Wed, 5 May 2010 13:47:54 -0700
> "Woodruff, Robert J" <robert.j.woodruff-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>
> >
> > Hi guys,
> >
> > When I run ibcheckerrors on my Mellanox switch,
> > it is reporting that Port all FAILED.
> >
> > From what I can tell, the switch is working fine and
> > I think that this is a bogus error from the program.
> >
> > If this is indeed not a real problem, can the diagnostic
> > be fixed to not report this as an error ?
> >
> >
> > ibcheckerrors -nocolor -v -t 100
> >
> > # Checking Switch: nodeguid 0x0002c902004046a0
> > Node check lid 7: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port all: FAILED <------------
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 2: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 3: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 7: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 8: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 9: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 10: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 17: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 18: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 20: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 25: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 26: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 27: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 28: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 34: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 35: OK
> > Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 36: OK
> >
> > Checking Ca: nodeguid 0x0002c9030002628a
> > Node check lid 14: OK
> > Error check on lid 14 (cstnh-2 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c90300025e0a
> > Node check lid 12: OK
> > Error check on lid 12 (cstnh-3 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c9030002615e
> > Node check lid 15: OK
> > Error check on lid 15 (cstnh-4 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c9030008e442
> > Node check lid 11: OK
> > Error check on lid 11 (cstnh-8 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c9030008e44e
> > Node check lid 8: OK
> > Error check on lid 8 (cstnh-11 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c9030008e3e6
> > Node check lid 2: OK
> > Error check on lid 2 (cstnh-13 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c9030008e44a
> > Node check lid 18: OK
> > Error check on lid 18 (cstnh-9 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c90300044fb4
> > Node check lid 13: OK
> > Error check on lid 13 (cstnh-7 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c90300044fbc
> > Node check lid 10: OK
> > Error check on lid 10 (cstnh-1 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c9030008e3ee
> > Node check lid 9: OK
> > Error check on lid 9 (cstnh-10 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c9030008e446
> > Node check lid 4: OK
> > Error check on lid 4 (cstnh-12 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c9030008e22e
> > Node check lid 1: OK
> > Error check on lid 1 (cstnh-14 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c9030008e43e
> > Node check lid 19: OK
> > Error check on lid 19 (cstnh-15 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0090270002000345
> > Node check lid 6: OK
> > Error check on lid 6 (cstnh-5 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0090270002000335
> > Node check lid 5: OK
> > Error check on lid 5 (cstnh-6 HCA-1) port 1: OK
> >
> > # Checking Ca: nodeguid 0x0002c90300028238
> > Node check lid 3: OK
> > Error check on lid 3 (cst-linux HCA-1) port 1: OK
> >
> > ## Summary: 17 nodes checked, 0 bad nodes found
> > ## 32 ports checked, 0 ports have errors beyond threshold
> > _______________________________________________
> > ewg mailing list
> > ewg-ZwoEplunGu1OwGhvXhtEPSCwEArCW2h5@public.gmane.org
> > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
> >
>
>
> --
> Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
--
Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: rnfs: rq_respages pointer is bad
From: Tom Tucker @ 2010-05-06 1:35 UTC (permalink / raw)
To: Roland Dreier
Cc: David J. Wilder, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
pradeep-r/Jw6+rmf7HQT0dZR+AlfA
In-Reply-To: <ada6332arcw.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
On 5/5/10 5:58 PM, Roland Dreier wrote:
> > The source of the problem is in rdma_read_complete(), I am finding that
> > rqstp->rq_respages is set to point past the end of the rqstp->rq_pages
> > page list. This results in a NULL reference in svc_process() when
> > passing rq_respages[0] to page_address().
>
> did this ever end up getting fixed upstream?
>
I believe it did, but I'll check to be sure. It may be in Bruce's queue too.
* Re: [ewg] ibcheckerrors "Port All FAILED" reported
From: Ira Weiny @ 2010-05-06 1:09 UTC (permalink / raw)
To: Woodruff, Robert J,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: EWG, tziporet-VPRAkNaXOzVS1MOuV/RT9w@public.gmane.org
In-Reply-To: <382A478CAD40FA4FB46605CF81FE39F45685DEAD-osO9UTpF0URzLByeVOV5+bfspsVTdybXVpNB7YpNyf8@public.gmane.org>
Interesting...
I have a switch which does this as well. Tracing through the scripts shows
that the perfquery command is failing like this.
14:29:03 > ./perfquery 40 255
./perfquery: iberror: failed: AllPortSelect not supported
It seems there is an issue with the CapabilityMask value...
14:43:32 > ./perfquery 40 255
cap_mask 0x400 <=== my debug output
./perfquery: iberror: failed: AllPortSelect not supported
14:43:38 > ./saquery CPI 40
SA ClassPortInfo:
...
Capability mask..........0x2602
...
Those don't match because... perfquery has a bug...
perfquery is issuing a PMA query when it should be issuing an SA query. It
just so happens that on some switches the result of that PMA query indicates
AllPortSelect is available. Patch to follow.
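For readers following along, here is a minimal sketch of the kind of bit test that produces the error above. It assumes that AllPortSelect is bit 8 (0x100) of the PerfMgt ClassPortInfo CapabilityMask and that the mask has already been converted to host byte order. The macro and function names are hypothetical and are not taken from perfquery or libibmad.

```c
#include <stdint.h>

/* Assumption: AllPortSelect is bit 8 (0x100) of the PerfMgt
 * ClassPortInfo CapabilityMask, with the mask in host byte order.
 * The macro name is hypothetical, chosen for this sketch only. */
#define PM_CAP_ALL_PORT_SELECT 0x100u

/* Returns 1 if the mask advertises AllPortSelect, 0 otherwise. */
static int has_all_port_select(uint16_t cap_mask)
{
	return (cap_mask & PM_CAP_ALL_PORT_SELECT) != 0;
}
```

Under that assumption, has_all_port_select(0x400) returns 0 for the mask in the debug output, which is consistent with the "AllPortSelect not supported" message. The SA ClassPortInfo mask (0x2602) belongs to a different management class, so its bits should not be read with this test.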
Ira
On Wed, 5 May 2010 13:47:54 -0700
"Woodruff, Robert J" <robert.j.woodruff-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>
> Hi guys,
>
> When I run ibcheckerrors on my Mellanox switch,
> it is reporting that Port all FAILED.
>
> From what I can tell, the switch is working fine and
> I think that this is a bogus error from the program.
>
> If this is indeed not a real problem, can the diagnostic
> be fixed to not report this as an error ?
>
>
> ibcheckerrors -nocolor -v -t 100
>
> # Checking Switch: nodeguid 0x0002c902004046a0
> Node check lid 7: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port all: FAILED <------------
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 2: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 3: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 7: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 8: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 9: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 10: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 17: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 18: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 20: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 25: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 26: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 27: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 28: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 34: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 35: OK
> Error check on lid 7 (Infiniscale-IV Mellanox Technologies) port 36: OK
>
> # Checking Ca: nodeguid 0x0002c9030002628a
> Node check lid 14: OK
> Error check on lid 14 (cstnh-2 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0002c90300025e0a
> Node check lid 12: OK
> Error check on lid 12 (cstnh-3 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0002c9030002615e
> Node check lid 15: OK
> Error check on lid 15 (cstnh-4 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0002c9030008e442
> Node check lid 11: OK
> Error check on lid 11 (cstnh-8 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0002c9030008e44e
> Node check lid 8: OK
> Error check on lid 8 (cstnh-11 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0002c9030008e3e6
> Node check lid 2: OK
> Error check on lid 2 (cstnh-13 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0002c9030008e44a
> Node check lid 18: OK
> Error check on lid 18 (cstnh-9 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0002c90300044fb4
> Node check lid 13: OK
> Error check on lid 13 (cstnh-7 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0002c90300044fbc
> Node check lid 10: OK
> Error check on lid 10 (cstnh-1 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0002c9030008e3ee
> Node check lid 9: OK
> Error check on lid 9 (cstnh-10 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0002c9030008e446
> Node check lid 4: OK
> Error check on lid 4 (cstnh-12 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0002c9030008e22e
> Node check lid 1: OK
> Error check on lid 1 (cstnh-14 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0002c9030008e43e
> Node check lid 19: OK
> Error check on lid 19 (cstnh-15 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0090270002000345
> Node check lid 6: OK
> Error check on lid 6 (cstnh-5 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0090270002000335
> Node check lid 5: OK
> Error check on lid 5 (cstnh-6 HCA-1) port 1: OK
>
> # Checking Ca: nodeguid 0x0002c90300028238
> Node check lid 3: OK
> Error check on lid 3 (cst-linux HCA-1) port 1: OK
>
> ## Summary: 17 nodes checked, 0 bad nodes found
> ## 32 ports checked, 0 ports have errors beyond threshold
> _______________________________________________
> ewg mailing list
> ewg-ZwoEplunGu1OwGhvXhtEPSCwEArCW2h5@public.gmane.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
>
--
Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org>