Netdev List


* Re: [PATCH] ip6mr: Add sizeof verification to MRT6_ASSERT and MT6_PIM
From: David Miller @ 2012-11-26 22:36 UTC (permalink / raw)
  To: joe; +Cc: eric.dumazet, netdev
In-Reply-To: <1353903994.2493.2.camel@joe-AO722>

From: Joe Perches <joe@perches.com>
Date: Sun, 25 Nov 2012 20:26:34 -0800

> Verify the length of the user-space arguments.
> 
> Signed-off-by: Joe Perches <joe@perches.com>

Applied to net-next, thanks.

^ permalink raw reply

* Re: [PATCH] net: ipmr: limit MRT_TABLE identifiers
From: David Miller @ 2012-11-26 22:37 UTC (permalink / raw)
  To: eric.dumazet; +Cc: gang.chen, netdev
In-Reply-To: <1353872669.30446.863.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sun, 25 Nov 2012 11:44:29 -0800

> From: Eric Dumazet <edumazet@google.com>
> 
> Name of pimreg devices are built from following format :
> 
> char name[IFNAMSIZ]; // IFNAMSIZ == 16
> 
> sprintf(name, "pimreg%u", mrt->id);
> 
> We must therefore limit mrt->id to 9 decimal digits
> or risk a buffer overflow and a crash.
> 
> Restrict table identifiers in [0 ... 999999999] interval.
> 
> Reported-by: Chen Gang <gang.chen@asianux.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied, thanks Eric.
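
The arithmetic behind the nine-digit limit can be checked with a short user-space sketch. This is only an illustration, not the kernel code: the helper names pimreg_name_size() and pimreg_name_fits() are hypothetical, and IFNAMSIZ is defined locally with the value quoted in the patch description.

```c
#include <stdio.h>

#define IFNAMSIZ 16 /* value quoted in the patch description */

/* Bytes needed to hold "pimreg<id>", including the terminating NUL. */
static int pimreg_name_size(unsigned int id)
{
	/* With a NULL buffer and size 0, snprintf only reports the length. */
	return snprintf(NULL, 0, "pimreg%u", id) + 1;
}

/* Nonzero when the generated name fits in a char[IFNAMSIZ] buffer. */
static int pimreg_name_fits(unsigned int id)
{
	return pimreg_name_size(id) <= IFNAMSIZ;
}
```

"pimreg" is 6 characters, so 9 digits plus the NUL byte fill the 16-byte buffer exactly, while a 10-digit identifier needs 17 bytes.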

^ permalink raw reply

* [PATCH for 3.8] iproute2: Add "ip netns pids" and "ip netns identify"
From: Eric W. Biederman @ 2012-11-26 23:16 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, Serge E. Hallyn


Add commands that map between network namespace names and process
identifiers.  The code builds and runs against older kernels but
only works on Linux 3.8+ kernels where I have fixed stat to work
properly.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---

I don't know if this is too soon to send this patch to iproute as the
kernel code that fixes stat is currently sitting in my for-next branch
of:
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git

and has not hit Linus's tree yet.  Still the code runs and is harmless
on older kernels so it should be harmless whatever happens with it.

 ip/ipnetns.c        |  141 +++++++++++++++++++++++++++++++++++++++++++++++++++
 man/man8/ip-netns.8 |    5 ++-
 2 files changed, 145 insertions(+), 1 deletions(-)

diff --git a/ip/ipnetns.c b/ip/ipnetns.c
index e41a598..c55fe3a 100644
--- a/ip/ipnetns.c
+++ b/ip/ipnetns.c
@@ -13,6 +13,7 @@
 #include <dirent.h>
 #include <errno.h>
 #include <unistd.h>
+#include <ctype.h>
 
 #include "utils.h"
 #include "ip_common.h"
@@ -48,6 +49,8 @@ static void usage(void)
 	fprintf(stderr, "Usage: ip netns list\n");
 	fprintf(stderr, "       ip netns add NAME\n");
 	fprintf(stderr, "       ip netns delete NAME\n");
+	fprintf(stderr, "       ip netns identify PID\n");
+	fprintf(stderr, "       ip netns pids NAME\n");
 	fprintf(stderr, "       ip netns exec NAME cmd ...\n");
 	fprintf(stderr, "       ip netns monitor\n");
 	exit(-1);
@@ -171,6 +174,138 @@ static int netns_exec(int argc, char **argv)
 	exit(-1);
 }
 
+static int is_pid(const char *str)
+{
+	int ch;
+	for (; (ch = *str); str++) {
+		if (!isdigit(ch))
+			return 0;
+	}
+	return 1;
+}
+
+static int netns_pids(int argc, char **argv)
+{
+	const char *name;
+	char net_path[MAXPATHLEN];
+	int netns;
+	struct stat netst;
+	DIR *dir;
+	struct dirent *entry;
+
+	if (argc < 1) {
+		fprintf(stderr, "No netns name specified\n");
+		return -1;
+	}
+	if (argc > 1) {
+		fprintf(stderr, "extra arguments specified\n");
+		return -1;
+	}
+
+	name = argv[0];
+	snprintf(net_path, sizeof(net_path), "%s/%s", NETNS_RUN_DIR, name);
+	netns = open(net_path, O_RDONLY);
+	if (netns < 0) {
+		fprintf(stderr, "Cannot open network namespace: %s\n",
+			strerror(errno));
+		return -1;
+	}
+	if (fstat(netns, &netst) < 0) {
+		fprintf(stderr, "Stat of netns failed: %s\n",
+			strerror(errno));
+		return -1;
+	}
+	dir = opendir("/proc/");
+	if (!dir) {
+		fprintf(stderr, "Open of /proc failed: %s\n",
+			strerror(errno));
+		return -1;
+	}
+	while((entry = readdir(dir))) {
+		char pid_net_path[MAXPATHLEN];
+		struct stat st;
+		if (!is_pid(entry->d_name))
+			continue;
+		snprintf(pid_net_path, sizeof(pid_net_path), "/proc/%s/ns/net",
+			entry->d_name);
+		if (stat(pid_net_path, &st) != 0)
+			continue;
+		if ((st.st_dev == netst.st_dev) &&
+		    (st.st_ino == netst.st_ino)) {
+			printf("%s\n", entry->d_name);
+		}
+	}
+	closedir(dir);
+	return 0;
+	
+}
+
+static int netns_identify(int argc, char **argv)
+{
+	const char *pidstr;
+	char net_path[MAXPATHLEN];
+	int netns;
+	struct stat netst;
+	DIR *dir;
+	struct dirent *entry;
+
+	if (argc < 1) {
+		fprintf(stderr, "No pid specified\n");
+		return -1;
+	}
+	if (argc > 1) {
+		fprintf(stderr, "extra arguments specified\n");
+		return -1;
+	}
+	pidstr = argv[0];
+
+	if (!is_pid(pidstr)) {
+		fprintf(stderr, "Specified string '%s' is not a pid\n",
+			pidstr);
+		return -1;
+	}
+
+	snprintf(net_path, sizeof(net_path), "/proc/%s/ns/net", pidstr);
+	netns = open(net_path, O_RDONLY);
+	if (netns < 0) {
+		fprintf(stderr, "Cannot open network namespace: %s\n",
+			strerror(errno));
+		return -1;
+	}
+	if (fstat(netns, &netst) < 0) {
+		fprintf(stderr, "Stat of netns failed: %s\n",
+			strerror(errno));
+		return -1;
+	}
+	dir = opendir(NETNS_RUN_DIR);
+	if (!dir)
+		return 0;
+
+	while((entry = readdir(dir))) {
+		char name_path[MAXPATHLEN];
+		struct stat st;
+
+		if (strcmp(entry->d_name, ".") == 0)
+			continue;
+		if (strcmp(entry->d_name, "..") == 0)
+			continue;
+
+		snprintf(name_path, sizeof(name_path), "%s/%s",	NETNS_RUN_DIR,
+			entry->d_name);
+
+		if (stat(name_path, &st) != 0)
+			continue;
+
+		if ((st.st_dev == netst.st_dev) &&
+		    (st.st_ino == netst.st_ino)) {
+			printf("%s\n", entry->d_name);
+		}
+	}
+	closedir(dir);
+	return 0;
+	
+}
+
 static int netns_delete(int argc, char **argv)
 {
 	const char *name;
@@ -298,6 +433,12 @@ int do_netns(int argc, char **argv)
 	if (matches(*argv, "delete") == 0)
 		return netns_delete(argc-1, argv+1);
 
+	if (matches(*argv, "identify") == 0)
+		return netns_identify(argc-1, argv+1);
+
+	if (matches(*argv, "pids") == 0)
+		return netns_pids(argc-1, argv+1);
+
 	if (matches(*argv, "exec") == 0)
 		return netns_exec(argc-1, argv+1);
 
diff --git a/man/man8/ip-netns.8 b/man/man8/ip-netns.8
index 349ee7e..e639836 100644
--- a/man/man8/ip-netns.8
+++ b/man/man8/ip-netns.8
@@ -1,4 +1,4 @@
-.TH IP\-NETNS 8 "20 Dec 2011" "iproute2" "Linux"
+.TH IP\-NETNS 8 "26 Dec 2012" "iproute2" "Linux"
 .SH NAME
 ip-netns \- process network namespace management
 .SH SYNOPSIS
@@ -58,6 +58,9 @@ their traditional location in /etc.
 .SS ip netns delete NAME - delete the name of a network namespace
 .SS ip netns exec NAME cmd ... - Run cmd in the named network namespace
 
+.SS ip netns pids NAME - Report processes in the named network namespace
+.SS ip netns identify PID - Report network namespaces names for process
+
 .SH EXAMPLES
 
 .SH SEE ALSO
-- 
1.7.5.4
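
Both new subcommands in the patch above rest on the same test: two paths name the same network namespace when stat() reports the same device and inode for them. A minimal stand-alone sketch of that comparison follows; the helper name same_file() is hypothetical and is not part of iproute2.

```c
#include <sys/stat.h>

/*
 * Returns 1 if both paths resolve to the same inode on the same device,
 * 0 if they differ, and -1 if either stat() call fails.
 */
static int same_file(const char *path_a, const char *path_b)
{
	struct stat a, b;

	if (stat(path_a, &a) != 0 || stat(path_b, &b) != 0)
		return -1;
	return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}
```

On a kernel with the stat fix mentioned in the patch description, calling this with /proc/<pid>/ns/net and a file under NETNS_RUN_DIR would be expected to answer the question that "ip netns identify" asks for each name.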

^ permalink raw reply related

* linux-next: manual merge of the net-next tree with the infiniband tree
From: Stephen Rothwell @ 2012-11-27  0:47 UTC (permalink / raw)
  To: David Miller, netdev
  Cc: linux-next, linux-kernel, Or Gerlitz, Roland Dreier, linux-rdma,
	Ben Hutchings

[-- Attachment #1: Type: text/plain, Size: 1077 bytes --]

Hi all,

Today's linux-next merge of the net-next tree got a conflict in
drivers/net/ethernet/mellanox/mlx4/en_rx.c between commit 08ff32352d6f
("mlx4: 64-byte CQE/EQE support") from the infiniband tree and commit
f1d29a3fa68b ("mlx4_en: Remove remnants of LRO support") from the
net-next tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au

diff --cc drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 6fa106f,f76c967..0000000
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@@ -710,12 -709,9 +710,9 @@@ next
  
  		++cq->mcq.cons_index;
  		index = (cq->mcq.cons_index) & ring->size_mask;
 -		cqe = &cq->buf[index];
 +		cqe = &cq->buf[(index << factor) + factor];
- 		if (++polled == budget) {
- 			/* We are here because we reached the NAPI budget -
- 			 * flush only pending LRO sessions */
+ 		if (++polled == budget)
  			goto out;
- 		}
  	}
  
  out:

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang
From: Joe Jin @ 2012-11-27  0:59 UTC (permalink / raw)
  To: Fujinaka, Todd
  Cc: Dave, Tushar N, netdev@vger.kernel.org, e1000-devel@lists.sf.net,
	linux-kernel@vger.kernel.org, Mary Mcgrath
In-Reply-To: <9B4A1B1917080E46B64F07F2989DADD62F2D62D6@ORSMSX102.amr.corp.intel.com>

On 11/27/12 00:23, Fujinaka, Todd wrote:
> If you look at the previous section, DevCap, you'll see that it's
> correctly advertising 256 bytes but the system is negotiating 128 for
> the link to the Ethernet controller. Things on the "other" side of the
> link are controlled outside of the e1000 driver.
> 
> Tushar's first suggestion was to check the PCIe payload settings in the
> entire chain. Have you done that? Mismatches will cause hangs.

Hi Todd,

I still need to find out how to modify the MaxPayload size. The BIOS has no
entry to change it, so I have to use ethtool, which means I need the offset
of the MaxPayload size in the EEPROM. I tried to find it in Intel's online
documentation but failed. Any idea?

Thanks in advance,
Joe

^ permalink raw reply

* [PATCH] netfilter updates for net (3.7-rc7)
From: pablo @ 2012-11-27  1:03 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Pablo Neira Ayuso <pablo@netfilter.org>

Hi David,

This update contains one patch to fix an overflow via the interface
name attribute in the ipset infrastructure, from Florian Westphal.

You can pull this change from:

git://1984.lsi.us.es/nf master

Thanks!

Florian Westphal (1):
  netfilter: ipset: fix netiface set name overflow

 net/netfilter/ipset/ip_set_hash_netiface.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-- 
1.7.10.4


^ permalink raw reply

* [PATCH] netfilter: ipset: fix netiface set name overflow
From: pablo @ 2012-11-27  1:03 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1353978185-3564-1-git-send-email-pablo@netfilter.org>

From: Florian Westphal <fw@strlen.de>

The attribute is copied to an IFNAMSIZ-sized stack variable,
but IFNAMSIZ is smaller than IPSET_MAXNAMELEN.

Fortunately nfnetlink needs CAP_NET_ADMIN.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/ipset/ip_set_hash_netiface.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/ipset/ip_set_hash_netiface.c b/net/netfilter/ipset/ip_set_hash_netiface.c
index b9a6338..45a1014 100644
--- a/net/netfilter/ipset/ip_set_hash_netiface.c
+++ b/net/netfilter/ipset/ip_set_hash_netiface.c
@@ -793,7 +793,7 @@ static struct ip_set_type hash_netiface_type __read_mostly = {
 		[IPSET_ATTR_IP]		= { .type = NLA_NESTED },
 		[IPSET_ATTR_IP_TO]	= { .type = NLA_NESTED },
 		[IPSET_ATTR_IFACE]	= { .type = NLA_NUL_STRING,
-					    .len = IPSET_MAXNAMELEN - 1 },
+					    .len  = IFNAMSIZ - 1 },
 		[IPSET_ATTR_CADT_FLAGS]	= { .type = NLA_U32 },
 		[IPSET_ATTR_CIDR]	= { .type = NLA_U8 },
 		[IPSET_ATTR_TIMEOUT]	= { .type = NLA_U32 },
-- 
1.7.10.4
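
The size mismatch that the patch above closes can be made concrete with a small sketch. The constants are defined locally for illustration: IFNAMSIZ is 16, and IPSET_MAXNAMELEN is assumed here to be 32. The helper fits_ifname() is hypothetical.

```c
#include <stddef.h>

#define IFNAMSIZ 16          /* interface name buffer size */
#define IPSET_MAXNAMELEN 32  /* assumed value, for illustration only */

enum {
	OLD_POLICY_MAX = IPSET_MAXNAMELEN - 1, /* longest string accepted before the fix */
	NEW_POLICY_MAX = IFNAMSIZ - 1          /* longest string accepted after the fix */
};

/* Nonzero when a string of attr_len characters plus its NUL fits in char[IFNAMSIZ]. */
static int fits_ifname(size_t attr_len)
{
	return attr_len <= IFNAMSIZ - 1;
}
```

Under these assumptions the old policy admitted strings up to 31 characters, 16 more than the destination could hold, while the new limit matches the buffer.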


^ permalink raw reply related

* RE: [net-next 06/10] ixgbe: eliminate Smatch warnings in ixgbe_debugfs.c
From: Hay, Joshua A @ 2012-11-27  1:06 UTC (permalink / raw)
  To: Dan Carpenter, Kirsher, Jeffrey T
  Cc: davem@davemloft.net, netdev@vger.kernel.org, gospo@redhat.com,
	sassmann@redhat.com
In-Reply-To: <20121121110409.GG6186@mwanda>

The return value will be changed to len to preserve error codes returned from simple_write_to_buffer.

However, changing the logic preceding this return breaks these functions.  If simple_write_to_buffer returns a positive value, other actions are performed with this value.  With this patch, the function will return immediately with that value instead.  This will effectively break the ixgbe_debugfs write operations.

So ultimately, the error check in the change:
> +	len = simple_write_to_buffer(ixgbe_dbg_reg_ops_buf,
> +				     sizeof(ixgbe_dbg_reg_ops_buf)-1,
> +				     ppos,
> +				     buffer,
> +				     count);
> +	if (len < 0)
> +		return -EFAULT;

should become:

	if (len < 0)
		return len;

Thanks,
Josh Hay
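
The distinction between "if (len)" and "if (len < 0)" can be sketched in user space. Everything here is an illustration under stated assumptions: mock_write_to_buffer() is a hypothetical stand-in for the kernel helper (it returns the number of bytes copied, or a negative errno value), and handle_write() is not the driver's function.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical stand-in: returns bytes copied, or a negative errno value. */
static ssize_t mock_write_to_buffer(char *to, size_t available,
				    const char *from, size_t count)
{
	if (!from)
		return -EFAULT;
	if (count > available)
		count = available;
	memcpy(to, from, count);
	return (ssize_t)count;
}

static char cmd_buf[32];

/* Returns the number of bytes consumed, or the helper's negative error. */
static ssize_t handle_write(const char *user, size_t count)
{
	ssize_t len = mock_write_to_buffer(cmd_buf, sizeof(cmd_buf) - 1,
					   user, count);

	if (len < 0)		/* only negative values are errors */
		return len;	/* propagate the error code unchanged */

	cmd_buf[len] = '\0';	/* positive length: terminate and keep going */
	return len;
}
```

With "if (len) return len;" the function would return before terminating and parsing the buffer whenever any bytes were copied, which is the breakage described above.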

-----Original Message-----
From: Dan Carpenter [mailto:dan.carpenter@oracle.com] 
Sent: Wednesday, November 21, 2012 3:04 AM
To: Kirsher, Jeffrey T
Cc: davem@davemloft.net; Hay, Joshua A; netdev@vger.kernel.org; gospo@redhat.com; sassmann@redhat.com
Subject: Re: [net-next 06/10] ixgbe: eliminate Smatch warnings in ixgbe_debugfs.c

On Wed, Nov 21, 2012 at 02:47:32AM -0800, Jeff Kirsher wrote:
> +	len = simple_write_to_buffer(ixgbe_dbg_reg_ops_buf,
> +				     sizeof(ixgbe_dbg_reg_ops_buf)-1,
> +				     ppos,
> +				     buffer,
> +				     count);
> +	if (len < 0)
> +		return -EFAULT;

Any negative return is bad.

	if (len)
		return len;

> +
> +	ixgbe_dbg_reg_ops_buf[len] = '\0';
>  
>  	if (strncmp(ixgbe_dbg_reg_ops_buf, "write", 5) == 0) {
>  		u32 reg, value;
> @@ -187,15 +196,15 @@ static ssize_t ixgbe_dbg_netdev_ops_write(struct file *filp,
>  	if (count >= sizeof(ixgbe_dbg_netdev_ops_buf))
>  		return -ENOSPC;
>  
> -	bytes_not_copied = copy_from_user(ixgbe_dbg_netdev_ops_buf,
> -					  buffer, count);
> -	if (bytes_not_copied < 0)
> -		return bytes_not_copied;
> -	else if (bytes_not_copied < count)
> -		count -= bytes_not_copied;
> -	else
> -		return -ENOSPC;
> -	ixgbe_dbg_netdev_ops_buf[count] = '\0';
> +	len = simple_write_to_buffer(ixgbe_dbg_netdev_ops_buf,
> +				     sizeof(ixgbe_dbg_netdev_ops_buf)-1,
> +				     ppos,
> +				     buffer,
> +				     count);
> +	if (len < 0)
> +		return -EFAULT;

Same.

> +
> +	ixgbe_dbg_netdev_ops_buf[len] = '\0';

regards,
dan carpenter

^ permalink raw reply

* Re: private netdev flags into UAPI?
From: David Howells @ 2012-11-27  2:03 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: dhowells, Or Gerlitz, netdev
In-Reply-To: <CAJZOPZLQSCNTfSm63z1P_S4ZmwL+9mbP7rOCH7yHgBFcjaz1UA@mail.gmail.com>

Or Gerlitz <or.gerlitz@gmail.com> wrote:

> On Mon, Nov 26, 2012 at 11:22 AM, David Howells <dhowells@redhat.com> wrote:
> > They were exposed to userspace already
> 
> So the script carries the bug into a new directory... why? AFAIK,
> intentionally there's no way to read private flags from user space, so
> what's the point in defining them there?

How should the script know what's private and what's not?  By the
encapsulation of code inside __KERNEL__ blocks.  In their absence, everything
is assumed to be public - given it is already part of the UAPI.  I don't know
that the code is private rather than the comment is wrong.

David
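
A minimal sketch of the convention described above, using hypothetical names (EXAMPLE_F_PUBLIC and EXAMPLE_F_PRIVATE are not real kernel symbols): anything outside a __KERNEL__ block is visible to user space, anything inside is not.

```c
/* Visible to both kernel and user space. */
#define EXAMPLE_F_PUBLIC 0x1

#ifdef __KERNEL__
/* Visible only when building kernel code. */
#define EXAMPLE_F_PRIVATE 0x2
#endif

/* Records whether the private flag leaked into this compilation unit. */
#ifdef EXAMPLE_F_PRIVATE
static const int example_private_visible = 1;
#else
static const int example_private_visible = 0;
#endif
```

Compiled as an ordinary user-space program, where __KERNEL__ is not defined, only the public flag is defined.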

^ permalink raw reply

* Re: 82571EB: Detected Hardware Unit Hang
From: Mary Mcgrath @ 2012-11-27  2:06 UTC (permalink / raw)
  To: Joe Jin; +Cc: netdev, e1000-devel, linux-kernel
In-Reply-To: <50B41077.3080009@oracle.com>

Joe,
Thank you for working on this.
I would love to find out how they expect a customer to make the modification
to "word  0x1A, and see if the 8th bit is 0 or 1, and to change to 0."

I have in turn asked the ct for the lspci command on eth3; maybe the incorrect setting is upstream.

Again,  thank you.
Regards
Mary



-----Original Message-----
From: Joe Jin 
Sent: Monday, November 26, 2012 8:00 PM
To: Fujinaka, Todd
Cc: Dave, Tushar N; netdev@vger.kernel.org; e1000-devel@lists.sf.net; linux-kernel@vger.kernel.org; Mary Mcgrath
Subject: Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang

On 11/27/12 00:23, Fujinaka, Todd wrote:
> If you look at the previous section, DevCap, you'll see that it's 
> correctly advertising 256 bytes but the system is negotiating 128 for 
> the link to the Ethernet controller. Things on the "other" side of the 
> link are controlled outside of the e1000 driver.
> 
> Tushar's first suggestion was to check the PCIe payload settings in 
> the entire chain. Have you done that? Mismatches will cause hangs.

Hi Todd,

I still need to find out how to modify the MaxPayload size. The BIOS has no entry to change it, so I have to use ethtool, which means I need the offset of the MaxPayload size in the EEPROM. I tried to find it in Intel's online documentation but failed. Any idea?

Thanks in advance,
Joe


^ permalink raw reply

* Re: [PATCH net-next v2] net: clean up locking in inet_frag_find()
From: Cong Wang @ 2012-11-27  3:05 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, Patrick McHardy, Pablo Neira Ayuso, David S. Miller
In-Reply-To: <1353942751.30446.1769.camel@edumazet-glaptop>

On Mon, 2012-11-26 at 07:12 -0800, Eric Dumazet wrote:
> On Mon, 2012-11-26 at 15:26 +0800, Cong Wang wrote:
> > It is weird to take the read lock outside of inet_frag_find()
> > but release it inside...  This can be improved by refactoring
> > the code, that is, introducing inet{4,6}_frag_find() which call
> > their own hash function, inet{4,6}_hash_frag(), hiding the
> > details from their callers.
> > 
> > Cc: Eric Dumazet <eric.dumazet@gmail.com>
> > Cc: Patrick McHardy <kaber@trash.net>
> > Cc: Pablo Neira Ayuso <pablo@netfilter.org>
> > Cc: David S. Miller <davem@davemloft.net>
> > Signed-off-by: Cong Wang <amwang@redhat.com>
> > 
> > ---
> >  include/net/inet_frag.h                 |   17 +++++-
> >  include/net/ipv6.h                      |    3 -
> >  net/ipv4/inet_fragment.c                |   82 +++++++++++++++++++++++++++++--
> >  net/ipv4/ip_fragment.c                  |   16 +-----
> >  net/ipv6/netfilter/nf_conntrack_reasm.c |    7 +--
> >  net/ipv6/reassembly.c                   |   34 +------------
> >  6 files changed, 97 insertions(+), 62 deletions(-)
> 
> 
> Not clear to me it's a win, as it adds 35 LOC. Nobody really complained
> about this locking scheme in the past.

Yeah, it seems everyone here is able to read any ugly code, except
me. :-)

> 
> Also Jesper is working on this stuff, so you don't really ease his work.
> 
> 

I will rebase his tree for him, no worries; handling conflicts is part of
my everyday life (I am a heavy `git rebase -i` user).

^ permalink raw reply

* Re: Fwd: Re: [PATCH] net: ipv6: change %8s to %s for rt->dst.dev->name in seq_printf of rt6_info_route
From: Shan Wei @ 2012-11-27  3:17 UTC (permalink / raw)
  To: Chen Gang; +Cc: Eric Dumazet, David Miller, netdev
In-Reply-To: <50AEEF08.4000707@asianux.com>

Chen Gang said, at 2012/11/23 11:35:
> 2) about %*s:
>  since kernel is an open system, IFNAMSIZ is belong to OS API level for outside
>    it has effect both on individual kernel modules and user mode system call
>    we need obey this rule, and %8s is not match this rule.
>    so %8s is not suitable. (and now we have to choose %16s or %s).

Your patch will change the format of /proc/net/ipv6_route.
Why do we need to stay consistent with user mode?
However the user manipulates the device name, it has no effect on what /proc/net/ipv6_route shows.

> 
>  for the format of information which seq_printf output:
>    it is not belong to OS API level for outside (at least, for current case, it is true). 
>    so we need not keep 'compatible' of it, so %16s is not necessary.

Can you explain what will happen if we don't change to %s?

> 
>  for keeping source code simple and clearly:
>    %s is better than %16s.
> 
>  so for result, we should choose %s only (neither %16s nor %8s).

^ permalink raw reply

* Re: Fwd: Re: [PATCH] net: ipv6: change %8s to %s for rt->dst.dev->name in seq_printf of rt6_info_route
From: Chen Gang @ 2012-11-27  4:18 UTC (permalink / raw)
  To: Shan Wei; +Cc: Eric Dumazet, David Miller, netdev
In-Reply-To: <50B430B5.1080700@gmail.com>

On 2012-11-27 11:17, Shan Wei wrote:
> Chen Gang said, at 2012/11/23 11:35:
>> 2) about %*s:
>>  since kernel is an open system, IFNAMSIZ is belong to OS API level for outside
>>    it has effect both on individual kernel modules and user mode system call
>>    we need obey this rule, and %8s is not match this rule.
>>    so %8s is not suitable. (and now we have to choose %16s or %s).
> 
> Your patch will change the format of /proc/net/ipv6_route.

  Yes, it will be changed.
  But the format is a matter of "User Experience"; it does not belong to the OS API level.
    For the OS API level: we must commit to not changing it (it is a contract).
    For User Experience: we can change it, but users may not like it.

> Why we need to keep be consistent with user mode?

  It is to "keep source code simple and clear".
  When others see the %8s, they can easily misunderstand it (it is not quite clear),
  so it is better to change it to %s.


> However user operates device name, no effect on the showing of /proc/net/ipv6_route.

  now, no effect.


All together:
  since the kernel is not a user-interactive program,
    "keeping source code simple and clear" is more important than "User Experience".


> 
>>
>>  for the format of information which seq_printf output:
>>    it is not belong to OS API level for outside (at least, for current case, it is true). 
>>    so we need not keep 'compatible' of it, so %16s is not necessary.
> 
> Can you explain If we don't change to %s, what will happen?
> 

  From the outside, nothing will happen.

  So it is not about correctness; it is only to "keep source code simple and clear".

>>
>>  for keeping source code simple and clearly:
>>    %s is better than %16s.
>>
>>  so for result, we should choose %s only (neither %16s nor %8s).
> 


-- 
Chen Gang

Asianux Corporation

^ permalink raw reply

* Re: Fwd: Re: [PATCH] net: ipv6: change %8s to %s for rt->dst.dev->name in seq_printf of rt6_info_route
From: Chen Gang @ 2012-11-27  4:45 UTC (permalink / raw)
  To: Shan Wei; +Cc: Eric Dumazet, David Miller, netdev
In-Reply-To: <50B43F12.1090709@asianux.com>

On 2012-11-27 12:18, Chen Gang wrote:
> On 2012-11-27 11:17, Shan Wei wrote:
>> > Chen Gang said, at 2012/11/23 11:35:
>>> >> 2) about %*s:
>>> >>  since kernel is an open system, IFNAMSIZ is belong to OS API level for outside
>>> >>    it has effect both on individual kernel modules and user mode system call
>>> >>    we need obey this rule, and %8s is not match this rule.
>>> >>    so %8s is not suitable. (and now we have to choose %16s or %s).
>> > 
>> > Your patch will change the format of /proc/net/ipv6_route.
>   Yes, it will be changed.
>   although it belongs to "User Experience", it is not belong to os api level.
>     for os api level: we must commit them not be changed (they are testament)
>     for User Experience: we can change it, but maybe users feel 'not good'.
> 
  I think (this is only my own view, and may not be correct):
    all user input through /proc/* is at the OS API level, not "User Experience".
    most output to the user through /proc/* is "User Experience".

>> > Why we need to keep be consistent with user mode?
>   it is for "keep source code simple and clear"
>   when others see the %8s, easy to make them miss understanding (not quite clear)
>   so better to change it to %s.
> 
> 
>> > However user operates device name, no effect on the showing of /proc/net/ipv6_route.
>   now, no effect.
> 
> 
> all together:
>   since we are not user interactive program,
>     "keeping source code simple and clear" is more important than "User Experience"
> 
> 
>> > 
>>> >>
>>> >>  for the format of information which seq_printf output:
>>> >>    it is not belong to OS API level for outside (at least, for current case, it is true). 
>>> >>    so we need not keep 'compatible' of it, so %16s is not necessary.
>> > 
>> > Can you explain If we don't change to %s, what will happen?
>> > 
>   for outside, nothing will happen.
> 
>   so it is not for correctness, it is only for "keep source code simple and clear".
> 
>>> >>
>>> >>  for keeping source code simple and clearly:
>>> >>    %s is better than %16s.
>>> >>
>>> >>  so for result, we should choose %s only (neither %16s nor %8s).
>> > 


-- 
Chen Gang

Asianux Corporation

^ permalink raw reply

* Re: Fwd: Re: [PATCH] net: ipv6: change %8s to %s for rt->dst.dev->name in seq_printf of rt6_info_route
From: Shan Wei @ 2012-11-27  4:56 UTC (permalink / raw)
  To: Chen Gang; +Cc: Eric Dumazet, David Miller, netdev
In-Reply-To: <50B43F12.1090709@asianux.com>

Chen Gang said, at 2012/11/27 12:18:
>>
>>>
>>>  for the format of information which seq_printf output:
>>>    it is not belong to OS API level for outside (at least, for current case, it is true). 
>>>    so we need not keep 'compatible' of it, so %16s is not necessary.
>>
>> Can you explain If we don't change to %s, what will happen?
>>
> 
>   for outside, nothing will happen.
> 
>   so it is not for correctness, it is only for "keep source code simple and clear".

So it's a clean-up patch that only benefits developers,
but it changes the /proc interface, which is for users.
Users come first, so let us end this thread unless you have a compelling reason to do it.

Thanks  
Shan Wei

^ permalink raw reply

* Re: Fwd: Re: [PATCH] net: ipv6: change %8s to %s for rt->dst.dev->name in seq_printf of rt6_info_route
From: Chen Gang @ 2012-11-27  5:35 UTC (permalink / raw)
  To: Shan Wei; +Cc: Eric Dumazet, David Miller, netdev
In-Reply-To: <50B447F3.2090806@gmail.com>

On 2012-11-27 12:56, Shan Wei wrote:
> Chen Gang said, at 2012/11/27 12:18:
>>>
>>>>
>>>>  for the format of information which seq_printf output:
>>>>    it is not belong to OS API level for outside (at least, for current case, it is true). 
>>>>    so we need not keep 'compatible' of it, so %16s is not necessary.
>>>
>>> Can you explain If we don't change to %s, what will happen?
>>>
>>
>>   for outside, nothing will happen.
>>
>>   so it is not for correctness, it is only for "keep source code simple and clear".
> 
> So, it's a clean-up type patch which is just for developer,
> but with the change of /proc interface which is for user.
> user is first, so let us end this thread unless you have necessary reasons to do it. 
> 

1)  It does not change the /proc interface.
    a) Neither %8s nor %s changes the output interface format (including contents, topology, separator marks, space redundancy).
    b) It belongs to 'User Experience', not to the OS API.
    c) Do you agree with what I say above?

2)  I think:
    one of the differences between a system service and a user-interactive program is:
      for a system service (including the kernel): "clean-up" is more important than "UE".
      for a user-interactive program:              "UE" is more important than "clean-up".
    Maybe that only holds in an ideal world (or only in theory).
    (What I said above may also be incorrect.)


3)  So all together:
    I can understand if it is not integrated into the main branch.
    It would still be good to continue discussing it in the ideal world (or in theory); I think it is valuable for learning from each other.

  :-)

  thanks.

gchen.

> Thanks  
> Shan Wei


-- 
Chen Gang

Asianux Corporation

^ permalink raw reply

* Re: Fwd: Re: [PATCH] net: ipv6: change %8s to %s for rt->dst.dev->name in seq_printf of rt6_info_route
From: Chen Gang @ 2012-11-27  5:40 UTC (permalink / raw)
  To: Shan Wei; +Cc: Eric Dumazet, David Miller, netdev
In-Reply-To: <50B44563.8060905@asianux.com>

>   I think (only for my thought, maybe not correct):
>     for user input through /proc/* are all for os api level, not for "User Experience".
>     for most of outputs to user through /proc/*, are "User Experience".
> 

  I now think what I said above is incorrect.

Now I think:
  A) Both input and output through /proc/* are at the OS API level.
  B) But neither %8s nor %s changes the output interface format (including contents, topology, separator marks, space redundancy).
  C) So the change belongs to 'User Experience', not to the OS API.

  Suggestions and additions from any other members are welcome.

  thanks.

  :-)

-- 
Chen Gang

Asianux Corporation

^ permalink raw reply

* linux-next: manual merge of the drop-experimental tree with the net-next tree
From: Stephen Rothwell @ 2012-11-27  5:45 UTC (permalink / raw)
  To: Kees Cook; +Cc: linux-next, linux-kernel, Ben Hutchings, David Miller, netdev

[-- Attachment #1: Type: text/plain, Size: 561 bytes --]

Hi Kees,

Today's linux-next merge of the drop-experimental tree got a conflict in
net/dsa/Kconfig between commit b3422a314c27 ("dsa: Hide core config
options; make drivers select what they need") from the net-next tree and
commit b8e8d99e4ee8 ("net/dsa: remove depends on CONFIG_EXPERIMENTAL")
from the drop-experimental tree.

I fixed it up (using the net-next version as that removed
CONFIG_EXPERIMENTAL as well) and can carry the fix as necessary (no
action is required).

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: [PATCH] vhost: fix length for cross region descriptor
From: Jason Wang @ 2012-11-27  5:47 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: netdev, David Miller, linux-kernel
In-Reply-To: <20121126155727.GA21716@redhat.com>

On 11/26/2012 11:57 PM, Michael S. Tsirkin wrote:
> If a single descriptor crosses a region, the
> second chunk length should be decremented
> by size translated so far, instead it includes
> the full descriptor length.
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>   drivers/vhost/vhost.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index ef8f598..5a3d0f1 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -1049,7 +1049,7 @@ static int translate_desc(struct vhost_dev *dev, u64 addr, u32 len,
>   		}
>   		_iov = iov + ret;
>   		size = reg->memory_size - addr + reg->guest_phys_addr;
> -		_iov->iov_len = min((u64)len, size);
> +		_iov->iov_len = min((u64)len - s, size);
>   		_iov->iov_base = (void __user *)(unsigned long)
>   			(reg->userspace_addr + addr - reg->guest_phys_addr);
>   		s += size;

Acked-by: Jason Wang <jasowang@redhat.com>
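
The effect of the one-line fix can be sketched outside the kernel. This is a simplified model, not the vhost code: struct region, struct chunk, find_region() and split_desc() are hypothetical names, and the sketch only tracks guest addresses and lengths.

```c
#include <stddef.h>
#include <stdint.h>

struct region { uint64_t guest_phys_addr; uint64_t memory_size; };
struct chunk  { uint64_t addr; uint64_t len; };

/* Find the region containing addr, or NULL if none does. */
static const struct region *find_region(const struct region *regs, size_t n,
					uint64_t addr)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (addr >= regs[i].guest_phys_addr &&
		    addr - regs[i].guest_phys_addr < regs[i].memory_size)
			return &regs[i];
	return NULL;
}

/* Split [addr, addr + len) into per-region chunks; returns count or -1. */
static int split_desc(const struct region *regs, size_t nregs, uint64_t addr,
		      uint32_t len, struct chunk *out, int max_chunks)
{
	uint64_t s = 0;	/* bytes of the descriptor translated so far */
	int ret = 0;

	while ((uint64_t)len > s) {
		const struct region *reg;
		uint64_t size, remaining;

		if (ret >= max_chunks)
			return -1;
		reg = find_region(regs, nregs, addr);
		if (!reg)
			return -1;
		/* Bytes left in this region starting at addr. */
		size = reg->memory_size - addr + reg->guest_phys_addr;
		/* The fix: cap by what is left of the descriptor, not by len. */
		remaining = (uint64_t)len - s;
		out[ret].addr = addr;
		out[ret].len = remaining < size ? remaining : size;
		s += size;
		addr += size;
		ret++;
	}
	return ret;
}
```

For a 0x300-byte descriptor that starts 0x100 bytes before the end of one region, the second chunk is 0x200 bytes; capping it with the full len instead would give 0x300.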

^ permalink raw reply

* Re: Fwd: Re: [PATCH] net: ipv6: change %8s to %s for rt->dst.dev->name in seq_printf of rt6_info_route
From: Chen Gang @ 2012-11-27  5:52 UTC (permalink / raw)
  To: Shan Wei; +Cc: Eric Dumazet, David Miller, netdev
In-Reply-To: <50B45265.9070303@asianux.com>

On 2012-11-27 13:40, Chen Gang wrote:
> 
> and now, I think:
>   A) both input and output through /proc/* are at the OS API level.
>   B) but neither %8s nor %s changes the output interface format (contents, topology, separator marks, redundant spaces).
>   C) so the choice belongs to 'User Experience', not to the OS API.
> 
>   Suggestions and additions from other members are welcome.
> 
>   thanks.
> 
>   :-)
> 

  addition: right-alignment to width 8 does not belong to the interface format.
    If it did belong to the interface format,
    it would cause a correctness issue (the name length may be larger than 8).
    So if "right-alignment to width 8" belongs to the OS API, the API is incorrect and needs to change.

  :-)
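
The width argument can be checked in user space with standard C printf
semantics. This is a small sketch: format_padded() and format_plain() are
hypothetical helpers, and the kernel's seq_printf() is only assumed to treat
the field width the same way. A width of 8 is a minimum, so short names are
left-padded with spaces while longer names are printed in full and shift the
column.

```c
#include <stddef.h>
#include <stdio.h>

/* Format name with a minimum field width of 8, right-aligned. */
static int format_padded(char *buf, size_t size, const char *name)
{
	return snprintf(buf, size, "%8s", name);
}

/* Format name with no padding at all. */
static int format_plain(char *buf, size_t size, const char *name)
{
	return snprintf(buf, size, "%s", name);
}
```

Calling format_padded() with "eth0" yields four spaces followed by "eth0",
while a 13-character name comes out unchanged, so a consumer that relies on a
fixed 8-column field already breaks on long names. Splitting on whitespace
works with either format.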

-- 
Chen Gang

Asianux Corporation

^ permalink raw reply

* Re: performance regression on HiperSockets depending on MTU size
From: Cong Wang @ 2012-11-27  6:21 UTC (permalink / raw)
  To: netdev
In-Reply-To: <1353946351.30446.1779.camel@edumazet-glaptop>

On Mon, 26 Nov 2012 at 16:12 GMT, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Hi Frank, thanks for this report.
>
> You could tweak tcp_limit_output_bytes, but IMO the root of the problem
> is in the driver itself.
>
> For example, I had to change mlx4 driver for the same problem : Make
> sure a TX packet can be "TX completed" in a short amount of time.
>
> In the case of mlx4, the wait time was 128 us, but I suspect in your
> case it's more like an infinite time or several ms.
>
> The driver is delaying the free of TX skbs by a fixed amount of time,
> or relies on subsequent transmits to perform the TX completion.
>

Eric,

Do you have a full list of such commits? I am trying to backport TSQ
to 2.6.32, and of course I don't want to miss these commits either.

Thanks!

^ permalink raw reply

* Re: linux-next: manual merge of the net-next tree with the infiniband tree
From: Or Gerlitz @ 2012-11-27  6:43 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: David Miller, netdev, linux-next, linux-kernel, Roland Dreier,
	linux-rdma, Ben Hutchings, Amir Vadai
In-Reply-To: <20121127114751.6961da02245ed6851190aca2@canb.auug.org.au>

On 27/11/2012 02:47, Stephen Rothwell wrote:
> Hi all,
>
> Today's linux-next merge of the net-next tree got a conflict in
> drivers/net/ethernet/mellanox/mlx4/en_rx.c between commit 08ff32352d6f
> ("mlx4: 64-byte CQE/EQE support") from the infiniband tree and commit
> f1d29a3fa68b ("mlx4_en: Remove remnants of LRO support") from the
> net-next tree.
>
> I fixed it up (see below) and can carry the fix as necessary (no action
> is required).
>

Acked-by: Or Gerlitz <ogerlitz@mellanox.com>

^ permalink raw reply

* Re: [net-next RFC] pktgen: don't wait for the device who doesn't free skb immediately after sent
From: Jason Wang @ 2012-11-27  6:45 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: mst, netdev, linux-kernel, virtualization, davem
In-Reply-To: <20121126093728.0a10f97f@nehalam.linuxnetplumber.net>

On 11/27/2012 01:37 AM, Stephen Hemminger wrote:
> On Mon, 26 Nov 2012 15:56:52 +0800
> Jason Wang <jasowang@redhat.com> wrote:
>
>> Some devices do not free the old tx skbs immediately after they have been sent
>> (usually in tx interrupt). One such example is virtio-net, which optimizes for
>> virt and only frees the possible old tx skbs during the next packet send. This
>> would lead pktgen to wait forever on the refcount of the skb if no other
>> packet is sent afterwards.
>>
>> Solve this issue by introducing a new flag, IFF_TX_SKB_FREE_DELAY, which
>> notifies pktgen that the device does not free the skb immediately after it has
>> been sent, so that pktgen does not wait for the refcount to drop to one.
>>
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
> Another alternative would be using skb_orphan() and skb->destructor.
> There are other cases where skb's are not freed right away.
Hi Stephen:

Do you mean registering a skb->destructor for pktgen and then setting and
checking bits in skb->tx_flag?

^ permalink raw reply

* Re: performance regression on HiperSockets depending on MTU size
From: Eric Dumazet @ 2012-11-27  6:46 UTC (permalink / raw)
  To: Cong Wang; +Cc: netdev
In-Reply-To: <k91m5l$77a$1@ger.gmane.org>

On Tue, 2012-11-27 at 06:21 +0000, Cong Wang wrote:

> Eric,
> 
> Do you have a full list of such commits? I am trying to backport TSQ
> to 2.6.32, and of course I don't want to miss these commits either.

I don't think there are other known issues.

mlx4 had a 'problem' because only recently we removed the skb_orphan()
call it used to do in its start_xmit() function.

I remember David had to revert BQL on NIU driver, but NIU does the
skb_orphan() call as well so TSQ is basically disabled.

^ permalink raw reply

* Re: private netdev flags into UAPI?
From: Or Gerlitz @ 2012-11-27  6:51 UTC (permalink / raw)
  To: David Howells; +Cc: Or Gerlitz, netdev
In-Reply-To: <990.1353981789@warthog.procyon.org.uk>

On 27/11/2012 04:03, David Howells wrote:
> Or Gerlitz <or.gerlitz@gmail.com> wrote:
>
>> On Mon, Nov 26, 2012 at 11:22 AM, David Howells <dhowells@redhat.com> wrote:
>>> They were exposed to userspace already
>> So the script carries the bug into a new directory... why? AFAIK,
>> intentionally there's no way to read private flags from user space, so
>> what's the point in defining them there?
> How should the script know what's private and what's not?  By the
> encapsulation of code inside __KERNEL__ blocks.  In their absence, everything
> is assumed to be public - given it is already part of the UAPI.  I don't know
> that the code is private rather than the comment is wrong.
>
>

Makes sense, but I have pointed out a bug in the final result, so one
way or another, the fact that the bug existed before doesn't mean we
should carry it over.

Or.

^ permalink raw reply
