From: Sasha Khapyorsky <sashak-smomgflXvOZWk0Htik3J/w@public.gmane.org>
To: Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org>
Cc: "linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
Hal Rosenstock
<hal.rosenstock-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Subject: Re: [PATCH] libibnetdisc: fix outstanding SMPs countung
Date: Fri, 16 Apr 2010 15:05:05 +0300
Message-ID: <20100416120505.GB11943@me>
In-Reply-To: <0EEE4F40-F1DD-46A6-B756-3C46DA06B403-i2BcT+NCU+M@public.gmane.org>
On 09:52 Wed 14 Apr , Ira Weiny wrote:
>
> > But then it blocks process_mads() to loop forever after single
> > send_smp() failure (with all empty queues and umad_recv() running
> > without timeout).
>
> But moving the cl_qmap_insert below the send call fixes that.
It doesn't:
int process_mads(smp_engine_t * engine)
{
	int rc = 0;
	while (engine->num_smps_outstanding > 0) {
		if ((rc = process_smp_queue(engine)) != 0)
			return rc;
		while (!cl_is_qmap_empty(&engine->smps_on_wire))
			if ((rc = process_one_recv(engine)) != 0)
				return rc;
	}
	return 0;
}
After a send_smp() failure, engine->num_smps_outstanding is still > 0 and
will never be decreased (tested).
> However, it does cause a memory leak because the smp is no longer in
> the smp_queue_head list.
You are right about the leak.
> It needs to be put back on that list to be
> retried with a limit on the retries (to prevent what you are saying
> here.)
We already have a retry mechanism implemented in umad_send(), so a failed
MAD should probably just be dropped and freed:
diff --git a/infiniband-diags/libibnetdisc/src/query_smp.c b/infiniband-diags/libibnetdisc/src/query_smp.c
index 08e3ef7..89c0b05 100644
--- a/infiniband-diags/libibnetdisc/src/query_smp.c
+++ b/infiniband-diags/libibnetdisc/src/query_smp.c
@@ -96,8 +96,10 @@ static int process_smp_queue(smp_engine_t * engine)
 	if (!smp)
 		return 0;
-	if ((rc = send_smp(smp, engine->ibmad_port)) != 0)
+	if ((rc = send_smp(smp, engine->ibmad_port)) != 0) {
+		free(smp);
 		return rc;
+	}
 	engine->num_smps_outstanding++;
 	cl_qmap_insert(&engine->smps_on_wire, (uint32_t) smp->rpc.trid,
 		       (cl_map_item_t *) smp);
> Are you seeing a hang?
I'm seeing an endless loop.
> I have seen a hang when running "iblinkinfo -S <guid>".
What do you mean by "hang"? An endless loop?
> However, the
> problem is not with send_smp. I am seeing the mad going on the wire
> and returning (according to madeye) but I am not receiving it from
> umad_recv. I don't know why. If I run with 1 outstanding mad it
> works???
Do you see this with current master? (For me 'iblinkinfo -S' works fine,
but I have only two switches.)
Sasha