Re: [patch iproute2 v8 1/2] lib/libnetlink: Add functions rtnl_talk_msg and rtnl_talk_iov

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: David Ahern <dsahern@gmail.com>
To: Chris Mi <chrism@mellanox.com>, netdev@vger.kernel.org
Cc: gerlitz.or@gmail.com, stephen@networkplumber.org,
	marcelo.leitner@gmail.com, phil@nwl.cc
Subject: Re: [patch iproute2 v8 1/2] lib/libnetlink: Add functions rtnl_talk_msg and rtnl_talk_iov
Date: Wed, 10 Jan 2018 12:20:36 -0700	[thread overview]
Message-ID: <adb47d43-6e49-b159-e95b-9da77306234d@gmail.com> (raw)
In-Reply-To: <20180110032742.8127-2-chrism@mellanox.com>

[-- Attachment #1: Type: text/plain, Size: 3618 bytes --]

On 1/9/18 8:27 PM, Chris Mi wrote:
> rtnl_talk can only send a single message to kernel. Add two functions
> rtnl_talk_msg and rtnl_talk_iov that can send multiple messages to kernel.
> rtnl_talk_msg takes struct msghdr * as argument.
> rtnl_talk_iov takes struct iovec * and iovlen as arguments.
> 
> Signed-off-by: Chris Mi <chrism@mellanox.com>
> ---
>  include/libnetlink.h |  6 ++++
>  lib/libnetlink.c     | 82 ++++++++++++++++++++++++++++++++++++++++------------
>  2 files changed, 70 insertions(+), 18 deletions(-)
> 
> diff --git a/include/libnetlink.h b/include/libnetlink.h
> index a4d83b9e..e9a63dbc 100644
> --- a/include/libnetlink.h
> +++ b/include/libnetlink.h
> @@ -96,6 +96,12 @@ int rtnl_dump_filter_nc(struct rtnl_handle *rth,
>  int rtnl_talk(struct rtnl_handle *rtnl, struct nlmsghdr *n,
>  	      struct nlmsghdr **answer)
>  	__attribute__((warn_unused_result));
> +int rtnl_talk_msg(struct rtnl_handle *rtnl, struct msghdr *m,
> +		  struct nlmsghdr **answer)
> +	__attribute__((warn_unused_result));

As mentioned before rtnl_talk_msg is not needed; you only need to add
rtnl_talk_iov. The attached fixup on top of your patch removes it and
adjusts __rtnl_talk_iov. Please roll that change into your patch.


While testing this I noticed 2 other oddities:

$ perf trace -s tc -b tc.batch
(stddev column removed to shorten line width)

 Summary of events:

 tc (780), 1857 events, 97.9%

   syscall            calls    total       min       avg       max
                               (msec)    (msec)    (msec)    (msec)
   --------------- -------- --------- --------- --------- ---------
   recvmsg              530     6.532     0.008     0.012     0.218
   open                 269     5.429     0.012     0.020     0.117
   sendmsg                4     3.518     0.092     0.879     1.647



1. recvmsg is called twice - once to peek at message size, allocate a
buffer and then really receive the message. That is overkill for ACKs.

2. I am using a batch file with drop filters:

filter add dev eth2 ingress protocol ip pref 273 flower dst_ip
192.168.253.0/16 action drop

and for each command tc is trying to dlopen m_drop.so:

open("/usr/lib/tc//m_drop.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such
file or directory)


With a patch to use a stack buffer for ACKs, the above perf summary becomes:

$ perf trace -s tc -b tc.batch

 Summary of events:

 tc (777), 1345 events, 97.1%

   syscall            calls    total       min       avg       max
                               (msec)    (msec)    (msec)    (msec)
   --------------- -------- --------- --------- --------- ---------
   open                 269     5.510     0.013     0.020     0.160
   recvmsg              274     3.758     0.009     0.014     0.396
   sendmsg                4     3.531     0.098     0.883     1.672


Making the open errors now the dominate overhead affecting performance.
If tc had some smarts that it already tried that file it would avoid the
subsequent open calls. The end result is a significant speed up compared
to the current tc:

 Summary of events:

 tc (785), 2333 events, 98.3%

   syscall            calls    total       min       avg       max
                               (msec)    (msec)    (msec)    (msec)
   --------------- -------- --------- --------- --------- ---------
   sendmsg              256     9.832     0.029     0.038     0.181
   open                 269     5.819     0.013     0.022     0.353
   recvmsg              530     5.592     0.009     0.011     0.285


Can you look at a follow on patch (not part of this set) to cache status
of dlopen attempts?

[-- Attachment #2: rtnl_talk_fixup.patch --]
[-- Type: text/plain, Size: 3173 bytes --]

diff --git a/include/libnetlink.h b/include/libnetlink.h
index e9a63dbca259..d6322190afaa 100644
--- a/include/libnetlink.h
+++ b/include/libnetlink.h
@@ -96,9 +96,6 @@ int rtnl_dump_filter_nc(struct rtnl_handle *rth,
 int rtnl_talk(struct rtnl_handle *rtnl, struct nlmsghdr *n,
 	      struct nlmsghdr **answer)
 	__attribute__((warn_unused_result));
-int rtnl_talk_msg(struct rtnl_handle *rtnl, struct msghdr *m,
-		  struct nlmsghdr **answer)
-	__attribute__((warn_unused_result));
 int rtnl_talk_iov(struct rtnl_handle *rtnl, struct iovec *iovec, size_t iovlen,
 		  struct nlmsghdr **answer)
 	__attribute__((warn_unused_result));
diff --git a/lib/libnetlink.c b/lib/libnetlink.c
index 183825c7fe0e..cb1990fcbf1c 100644
--- a/lib/libnetlink.c
+++ b/lib/libnetlink.c
@@ -581,38 +581,40 @@ static void rtnl_talk_error(struct nlmsghdr *h, struct nlmsgerr *err,
 		strerror(-err->error));
 }
 
-static int __rtnl_talk_msg(struct rtnl_handle *rtnl, struct msghdr *m,
-			   struct nlmsghdr **answer,
+
+static int __rtnl_talk_iov(struct rtnl_handle *rtnl, struct iovec *iov,
+			   size_t iovlen, struct nlmsghdr **answer,
 			   bool show_rtnl_err, nl_ext_ack_fn_t errfn)
 {
 	struct sockaddr_nl nladdr = { .nl_family = AF_NETLINK };
-	int i, status, iovlen = m->msg_iovlen;
-	struct iovec iov;
+	int i, status;
+	struct iovec riov;
 	struct msghdr msg = {
 		.msg_name = &nladdr,
 		.msg_namelen = sizeof(nladdr),
-		.msg_iov = &iov,
-		.msg_iovlen = 1,
+		.msg_iov = iov,
+		.msg_iovlen = iovlen,
 	};
 	unsigned int seq = 0;
 	struct nlmsghdr *h;
 	char *buf;
 
 	for (i = 0; i < iovlen; i++) {
-		struct iovec *v;
-		v = &m->msg_iov[i];
-		h = v->iov_base;
+		h = iov[i].iov_base;
 		h->nlmsg_seq = seq = ++rtnl->seq;
 		if (answer == NULL)
 			h->nlmsg_flags |= NLM_F_ACK;
 	}
 
-	status = sendmsg(rtnl->fd, m, 0);
+	status = sendmsg(rtnl->fd, &msg, 0);
 	if (status < 0) {
 		perror("Cannot talk to rtnetlink");
 		return -1;
 	}
 
+	/* change msg to use the response iov */
+	msg.msg_iov = &riov;
+	msg.msg_iovlen = 1;
 	i = 0;
 	while (1) {
 next:
@@ -705,21 +707,6 @@ static int __rtnl_talk_msg(struct rtnl_handle *rtnl, struct msghdr *m,
 	}
 }
 
-static int __rtnl_talk_iov(struct rtnl_handle *rtnl, struct iovec *msg_iov,
-			   size_t iovlen, struct nlmsghdr **answer,
-			   bool show_rtnl_err, nl_ext_ack_fn_t errfn)
-{
-	struct sockaddr_nl nladdr = { .nl_family = AF_NETLINK };
-	struct msghdr msg = {
-		.msg_name = &nladdr,
-		.msg_namelen = sizeof(nladdr),
-		.msg_iov = msg_iov,
-		.msg_iovlen = iovlen,
-	};
-
-	return __rtnl_talk_msg(rtnl, &msg, answer, show_rtnl_err, errfn);
-}
-
 static int __rtnl_talk(struct rtnl_handle *rtnl, struct nlmsghdr *n,
 		       struct nlmsghdr **answer,
 		       bool show_rtnl_err, nl_ext_ack_fn_t errfn)
@@ -738,12 +725,6 @@ int rtnl_talk(struct rtnl_handle *rtnl, struct nlmsghdr *n,
 	return __rtnl_talk(rtnl, n, answer, true, NULL);
 }
 
-int rtnl_talk_msg(struct rtnl_handle *rtnl, struct msghdr *m,
-		  struct nlmsghdr **answer)
-{
-	return __rtnl_talk_msg(rtnl, m, answer, true, NULL);
-}
-
 int rtnl_talk_iov(struct rtnl_handle *rtnl, struct iovec *iovec, size_t iovlen,
 		  struct nlmsghdr **answer)
 {

next prev parent reply	other threads:[~2018-01-10 19:20 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-10  3:27 [patch iproute2 v8 0/2] tc: Add batchsize feature to batch mode Chris Mi
2018-01-10  3:27 ` [patch iproute2 v8 1/2] lib/libnetlink: Add functions rtnl_talk_msg and rtnl_talk_iov Chris Mi
2018-01-10 19:20   ` David Ahern [this message]
2018-01-10 20:12     ` Phil Sutter
2018-01-11 15:08       ` Phil Sutter
2018-01-12  0:06         ` David Ahern
2018-01-11  5:34     ` Chris Mi
2018-01-10  3:27 ` [patch iproute2 v8 2/2] tc: Add batchsize feature for filter and actions Chris Mi
2018-01-10 11:42   ` Marcelo Ricardo Leitner
2018-01-11  5:32     ` Chris Mi
2018-01-10 19:41   ` David Ahern
2018-01-11  5:39     ` Chris Mi

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:e9a63dbca25 dfblob:d6322190afa dfblob:183825c7fe0
dfblob:cb1990fcbf1 )
 OR (
bs:"Re: [patch iproute2 v8 1/2] lib/libnetlink: Add functions rtnl_talk_msg and rtnl_talk_iov" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=adb47d43-6e49-b159-e95b-9da77306234d@gmail.com \
    --to=dsahern@gmail.com \
    --cc=chrism@mellanox.com \
    --cc=gerlitz.or@gmail.com \
    --cc=marcelo.leitner@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=phil@nwl.cc \
    --cc=stephen@networkplumber.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).