netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Steve Wise" <swise@opengridcomputing.com>
To: "'Leon Romanovsky'" <leon@kernel.org>
Cc: <dsahern@gmail.com>, <stephen@networkplumber.org>,
	<netdev@vger.kernel.org>, <linux-rdma@vger.kernel.org>
Subject: RE: [PATCH v1 iproute2-next 1/4] rdma: add helper rd_sendrecv_msg()
Date: Mon, 4 Mar 2019 08:13:04 -0600	[thread overview]
Message-ID: <007901d4d294$62d5d280$28817780$@opengridcomputing.com> (raw)
In-Reply-To: <20190303135052.GY15253@mtr-leonro.mtl.com>



> > >
> > > On 2/23/2019 3:26 AM, Leon Romanovsky wrote:
> > > > On Thu, Feb 21, 2019 at 08:19:03AM -0800, Steve Wise wrote:
> > > >> This function sends the constructed netlink message and then
> > > >> receives the response, displaying any error text.
> > > >>
> > > >> Change 'rdma dev set' to use it.
> > > >>
> > > >> Signed-off-by: Steve Wise <swise@opengridcomputing.com>
> > > >> ---
> > > >>  rdma/dev.c   |  2 +-
> > > >>  rdma/rdma.h  |  1 +
> > > >>  rdma/utils.c | 21 +++++++++++++++++++++
> > > >>  3 files changed, 23 insertions(+), 1 deletion(-)
> > > >>
> > > >> diff --git a/rdma/dev.c b/rdma/dev.c
> > > >> index 60ff4b31e320..d2949c378f08 100644
> > > >> --- a/rdma/dev.c
> > > >> +++ b/rdma/dev.c
> > > >> @@ -273,7 +273,7 @@ static int dev_set_name(struct rd *rd)
> > > >>  	mnl_attr_put_u32(rd->nlh, RDMA_NLDEV_ATTR_DEV_INDEX, rd-
> > > >dev_idx);
> > > >>  	mnl_attr_put_strz(rd->nlh, RDMA_NLDEV_ATTR_DEV_NAME,
> > > rd_argv(rd));
> > > >>
> > > >> -	return rd_send_msg(rd);
> > > >> +	return rd_sendrecv_msg(rd, seq);
> > > >>  }
> > > >>
> > > >>  static int dev_one_set(struct rd *rd)
> > > >> diff --git a/rdma/rdma.h b/rdma/rdma.h
> > > >> index 547bb5749a39..20be2f12c4f8 100644
> > > >> --- a/rdma/rdma.h
> > > >> +++ b/rdma/rdma.h
> > > >> @@ -115,6 +115,7 @@ bool rd_check_is_key_exist(struct rd *rd, const
> > > char *key);
> > > >>   */
> > > >>  int rd_send_msg(struct rd *rd);
> > > >>  int rd_recv_msg(struct rd *rd, mnl_cb_t callback, void *data,
uint32_t
> > > seq);
> > > >> +int rd_sendrecv_msg(struct rd *rd, unsigned int seq);
> > > >>  void rd_prepare_msg(struct rd *rd, uint32_t cmd, uint32_t *seq,
> > uint16_t
> > > flags);
> > > >>  int rd_dev_init_cb(const struct nlmsghdr *nlh, void *data);
> > > >>  int rd_attr_cb(const struct nlattr *attr, void *data);
> > > >> diff --git a/rdma/utils.c b/rdma/utils.c
> > > >> index 069d44fece10..a6f2826c9605 100644
> > > >> --- a/rdma/utils.c
> > > >> +++ b/rdma/utils.c
> > > >> @@ -664,6 +664,27 @@ int rd_recv_msg(struct rd *rd, mnl_cb_t
> callback,
> > > void *data, unsigned int seq)
> > > >>  	return ret;
> > > >>  }
> > > >>
> > > >> +static int null_cb(const struct nlmsghdr *nlh, void *data)
> > > >> +{
> > > >> +	return MNL_CB_OK;
> > > >> +}
> > > >> +
> > > >> +int rd_sendrecv_msg(struct rd *rd, unsigned int seq)
> > > >> +{
> > > >> +	int ret;
> > > >> +
> > > >> +	ret = rd_send_msg(rd);
> > > >> +	if (ret) {
> > > >> +		perror(NULL);
> > > > This is more or less already done in rd_send_msg() and that function
> > > > prints something in case of execution error. So the missing piece
> > > > is to update rd_recv_msg(), so all places will "magically" print
errors
> > > > and not only dev_set_name().
> > >
> > > Yea ok.
> > >
> >
> > dev_set_name() doesn't call rd_recv_msg().  So you're suggesting I fix
up
> > rd_recv_msg() to display errors and make dev_set_name() call
> rd_recv_msg()
> > with the null_cb function?  You sure that's the way to go?
> 
> I'm sure that we need to fix dev_set_name(), everything else I'm not sure.
> 
> Thanks

Hey Leon, adding this to rd_recv_msg():

@@ -693,10 +693,28 @@ int rd_recv_msg(struct rd *rd, mnl_cb_t callback, void
*data, unsigned int seq)
                ret = mnl_cb_run(buf, ret, seq, portid, callback, data);
        } while (ret > 0);

+       if (ret < 0)
+               perror(NULL);
+
        mnl_socket_close(rd->nl);
        return ret;
 }

Results in unexpected errors being logged when doing a query such as:

[root@stevo1 iproute2]# ./rdma/rdma res show qp lqpn 176
error: Invalid argument
link mlx5_0/1 lqpn 176 type UD state RTS sq-psn 0 comm [ib_core]
error: Invalid argument
error: No such file or directory
error: Invalid argument
error: No such file or directory

It appears the "invalid argument" errors are due to rdmatool sending a
RDMA_NLDEV_CMD_RES_QP_GET command using the doit kernel method to allow
querying for just a QP with lqpn = 176.  However, rdmatool isn't passing a
port index in the messages that generate the "invalid argument" error from
the kernel.  IE you must provide a device index and port index when issuing
a doit command vs a dumpit command.  I think. 

This error was not found because rd_recv_msg() never displayed any errors
previously.  Further, the RES_FUNC() massive macro has code that will retry
a failed doit call with a dumpit call.  I think _##name() should distinguish
between failures reported by the kernel doit function vs failures because no
doit function exists.  Not sure how to support that.


        static inline int _##name(struct rd *rd)
\
        {
\
                uint32_t idx;
\
                int ret;
\
                if (id) {
\
                        ret = rd_doit_index(rd, &idx);
\
                        if (ret) {
\
                                ret = _res_send_idx_msg(rd, command,
\
                                                        name##_idx_parse_cb,
\
                                                        idx, id);
\
                                if (!ret)
\
                                        return ret;
\
                                /* Fallback for old systems without .doit
callbacks */ \
                        }
\
                }
\
                return _res_send_msg(rd, command, name##_parse_cb);
\
        }
\



The "no such file or dir" errors are being returned because, in my setup,
there are 2 other links that do not have lqpn 176.   So there are 2 issues
uncovered by adding generic printing of errors in rd_recv_msg()

1) the doit code in rdmatool is generating requests for a doit method in the
kernel w/o providing a port index.
2) some paths in rdmatool should not print "benign" errors like filtering on
a GET command causing a "does not exist" error returned by the kernel doit
func.

#1 is a bug, IMO.  Can you propose a fix?
#2 could be solved by adding an error callback func passed to rd_recv_msg().
Then the RES_FUNC() functions could parse errors like "no such file or dir"
when doing a filtered query and silently drop them.  And functions like
dev_set_name() would display all errors returned because there are no
expected errors other than "success".

Steve.



  reply	other threads:[~2019-03-04 14:13 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-21 18:22 [PATCH v1 iproute2-next 0/4] Dynamic rdma link creation Steve Wise
2019-02-21 16:19 ` [PATCH v1 iproute2-next 1/4] rdma: add helper rd_sendrecv_msg() Steve Wise
2019-02-23  9:26   ` Leon Romanovsky
2019-02-23  9:31     ` Leon Romanovsky
2019-02-26 17:19       ` Steve Wise
2019-02-26 19:16         ` Leon Romanovsky
2019-02-26 19:55           ` Steve Wise
2019-02-26 19:55         ` David Ahern
2019-02-26 17:13     ` Steve Wise
2019-02-27 20:23       ` Steve Wise
2019-03-03 13:50         ` Leon Romanovsky
2019-03-04 14:13           ` Steve Wise [this message]
2019-03-06 21:50             ` Steve Wise
2019-03-07  8:33               ` Leon Romanovsky
2019-02-28 19:41     ` Steve Wise
2019-02-28 19:56       ` Leon Romanovsky
2019-02-28 20:10         ` Steve Wise
2019-02-21 16:19 ` [PATCH v1 iproute2-next 2/4] Sync up rdma_netlink.h Steve Wise
2019-02-21 18:56   ` Stephen Hemminger
2019-02-21 22:10     ` Leon Romanovsky
2019-02-21 23:11     ` Jason Gunthorpe
2019-02-23  9:28   ` Leon Romanovsky
2019-02-26 17:15     ` Steve Wise
2019-02-21 16:19 ` [PATCH v1 iproute2-next 3/4] rdma: add 'link add/delete' commands Steve Wise
2019-02-21 23:14   ` Jason Gunthorpe
2019-02-23  9:43   ` Leon Romanovsky
2019-02-26 17:24     ` Steve Wise
2019-02-27 21:18       ` Steve Wise
2019-02-21 16:19 ` [PATCH v1 iproute2-next 4/4] rdma: man page update for link add/delete Steve Wise

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='007901d4d294$62d5d280$28817780$@opengridcomputing.com' \
    --to=swise@opengridcomputing.com \
    --cc=dsahern@gmail.com \
    --cc=leon@kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=stephen@networkplumber.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).