Netdev List
* Re: [PATCH iproute2 4/4] tc: Allow to easy change network namespace
From: Jiri Pirko @ 2014-12-14  9:36 UTC (permalink / raw)
  To: Vadim Kochan; +Cc: netdev
In-Reply-To: <1418493334-23142-5-git-send-email-vadim4j@gmail.com>

Sat, Dec 13, 2014 at 06:55:34PM CET, vadim4j@gmail.com wrote:
>From: Vadim Kochan <vadim4j@gmail.com>
>
>Add a new '-netns' option to simplify executing the following command:
>
>    ip netns exec NETNS tc OPTIONS COMMAND OBJECT
>
>    to
>
>    tc -n[etns] NETNS OPTIONS COMMAND OBJECT
>
>e.g.:
>
>    tc -net vnet0 qdisc
>
>Signed-off-by: Vadim Kochan <vadim4j@gmail.com>

Signed-off-by: Jiri Pirko <jiri@resnulli.us>

* Re: [PATCH iproute2 3/4] bridge: Allow to easy change network namespace
From: Jiri Pirko @ 2014-12-14  9:36 UTC (permalink / raw)
  To: Vadim Kochan; +Cc: netdev
In-Reply-To: <1418493334-23142-4-git-send-email-vadim4j@gmail.com>

Sat, Dec 13, 2014 at 06:55:33PM CET, vadim4j@gmail.com wrote:
>From: Vadim Kochan <vadim4j@gmail.com>
>
>Add a new '-netns' option to simplify executing the following command:
>
>    ip netns exec NETNS bridge OPTIONS COMMAND OBJECT
>
>    to
>
>    bridge -n[etns] NETNS OPTIONS COMMAND OBJECT
>
>e.g.:
>
>    bridge -net vnet0 fdb
>
>Signed-off-by: Vadim Kochan <vadim4j@gmail.com>

Signed-off-by: Jiri Pirko <jiri@resnulli.us>

* Re: [PATCH iproute2 2/4] ip: Allow to easy change network namespace
From: Jiri Pirko @ 2014-12-14  9:36 UTC (permalink / raw)
  To: Vadim Kochan; +Cc: netdev
In-Reply-To: <1418493334-23142-3-git-send-email-vadim4j@gmail.com>

Sat, Dec 13, 2014 at 06:55:32PM CET, vadim4j@gmail.com wrote:
>From: Vadim Kochan <vadim4j@gmail.com>
>
>Add a new '-netns' option to simplify executing the following command:
>
>    ip netns exec NETNS ip OPTIONS COMMAND OBJECT
>
>    to
>
>    ip -n[etns] NETNS OPTIONS COMMAND OBJECT
>
>e.g.:
>
>    ip -net vnet0 link add br0 type bridge
>    ip -n vnet0 link
>
>Signed-off-by: Vadim Kochan <vadim4j@gmail.com>

Signed-off-by: Jiri Pirko <jiri@resnulli.us>

* Re: [PATCH iproute2 1/4] lib: Add netns_switch func for change network namespace
From: Jiri Pirko @ 2014-12-14  9:35 UTC (permalink / raw)
  To: Vadim Kochan; +Cc: netdev
In-Reply-To: <1418493334-23142-2-git-send-email-vadim4j@gmail.com>

Sat, Dec 13, 2014 at 06:55:31PM CET, vadim4j@gmail.com wrote:
>From: Vadim Kochan <vadim4j@gmail.com>
>
>Move the namespace switching code from ip/ipnetns.c into a new
>netns_switch() function in lib/namespace.c so that other tools can
>use it to switch network namespaces.
>
>Signed-off-by: Vadim Kochan <vadim4j@gmail.com>

Signed-off-by: Jiri Pirko <jiri@resnulli.us>

* [PATCH] cirrus: cs89x0: fix time comparison
From: Asaf Vertz @ 2014-12-14  8:34 UTC (permalink / raw)
  To: davem; +Cc: ebiederm, julia.lawall, himangi774, asaf.vertz, netdev,
	linux-kernel

To be future-proof and for better readability the time comparisons are
modified to use time_before, time_after, and time_after_eq instead of
plain, error-prone math.

Signed-off-by: Asaf Vertz <asaf.vertz@tandemg.com>
---
 drivers/net/ethernet/cirrus/cs89x0.c |   27 ++++++++++++++-------------
 1 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/cirrus/cs89x0.c b/drivers/net/ethernet/cirrus/cs89x0.c
index 9823a0e..965331f 100644
--- a/drivers/net/ethernet/cirrus/cs89x0.c
+++ b/drivers/net/ethernet/cirrus/cs89x0.c
@@ -60,6 +60,7 @@
 #include <linux/interrupt.h>
 #include <linux/ioport.h>
 #include <linux/in.h>
+#include <linux/jiffies.h>
 #include <linux/skbuff.h>
 #include <linux/spinlock.h>
 #include <linux/string.h>
@@ -238,13 +239,13 @@ writereg(struct net_device *dev, u16 regno, u16 value)
 static int __init
 wait_eeprom_ready(struct net_device *dev)
 {
-	int timeout = jiffies;
+	unsigned long timeout = jiffies;
 	/* check to see if the EEPROM is ready,
 	 * a timeout is used just in case EEPROM is ready when
 	 * SI_BUSY in the PP_SelfST is clear
 	 */
 	while (readreg(dev, PP_SelfST) & SI_BUSY)
-		if (jiffies - timeout >= 40)
+		if (time_after_eq(jiffies, timeout + 40))
 			return -1;
 	return 0;
 }
@@ -485,7 +486,7 @@ control_dc_dc(struct net_device *dev, int on_not_off)
 {
 	struct net_local *lp = netdev_priv(dev);
 	unsigned int selfcontrol;
-	int timenow = jiffies;
+	unsigned long timenow = jiffies;
 	/* control the DC to DC convertor in the SelfControl register.
 	 * Note: This is hooked up to a general purpose pin, might not
 	 * always be a DC to DC convertor.
@@ -499,7 +500,7 @@ control_dc_dc(struct net_device *dev, int on_not_off)
 	writereg(dev, PP_SelfCTL, selfcontrol);
 
 	/* Wait for the DC/DC converter to power up - 500ms */
-	while (jiffies - timenow < HZ)
+	while (time_before(jiffies, timenow + HZ))
 		;
 }
 
@@ -514,7 +515,7 @@ send_test_pkt(struct net_device *dev)
 		0, 0,		/* DSAP=0 & SSAP=0 fields */
 		0xf3, 0		/* Control (Test Req + P bit set) */
 	};
-	long timenow = jiffies;
+	unsigned long timenow = jiffies;
 
 	writereg(dev, PP_LineCTL, readreg(dev, PP_LineCTL) | SERIAL_TX_ON);
 
@@ -525,10 +526,10 @@ send_test_pkt(struct net_device *dev)
 	iowrite16(ETH_ZLEN, lp->virt_addr + TX_LEN_PORT);
 
 	/* Test to see if the chip has allocated memory for the packet */
-	while (jiffies - timenow < 5)
+	while (time_before(jiffies, timenow + 5))
 		if (readreg(dev, PP_BusST) & READY_FOR_TX_NOW)
 			break;
-	if (jiffies - timenow >= 5)
+	if (time_after_eq(jiffies, timenow + 5))
 		return 0;	/* this shouldn't happen */
 
 	/* Write the contents of the packet */
@@ -536,7 +537,7 @@ send_test_pkt(struct net_device *dev)
 
 	cs89_dbg(1, debug, "Sending test packet ");
 	/* wait a couple of jiffies for packet to be received */
-	for (timenow = jiffies; jiffies - timenow < 3;)
+	for (timenow = jiffies; time_before(jiffies, timenow + 3);)
 		;
 	if ((readreg(dev, PP_TxEvent) & TX_SEND_OK_BITS) == TX_OK) {
 		cs89_dbg(1, cont, "succeeded\n");
@@ -556,7 +557,7 @@ static int
 detect_tp(struct net_device *dev)
 {
 	struct net_local *lp = netdev_priv(dev);
-	int timenow = jiffies;
+	unsigned long timenow = jiffies;
 	int fdx;
 
 	cs89_dbg(1, debug, "%s: Attempting TP\n", dev->name);
@@ -574,7 +575,7 @@ detect_tp(struct net_device *dev)
 	/* Delay for the hardware to work out if the TP cable is present
 	 * - 150ms
 	 */
-	for (timenow = jiffies; jiffies - timenow < 15;)
+	for (timenow = jiffies; time_before(jiffies, timenow + 15);)
 		;
 	if ((readreg(dev, PP_LineST) & LINK_OK) == 0)
 		return DETECTED_NONE;
@@ -618,7 +619,7 @@ detect_tp(struct net_device *dev)
 		if ((lp->auto_neg_cnf & AUTO_NEG_BITS) == AUTO_NEG_ENABLE) {
 			pr_info("%s: negotiating duplex...\n", dev->name);
 			while (readreg(dev, PP_AutoNegST) & AUTO_NEG_BUSY) {
-				if (jiffies - timenow > 4000) {
+				if (time_after(jiffies, timenow + 4000)) {
 					pr_err("**** Full / half duplex auto-negotiation timed out ****\n");
 					break;
 				}
@@ -1271,7 +1272,7 @@ static void __init reset_chip(struct net_device *dev)
 {
 #if !defined(CONFIG_MACH_MX31ADS)
 	struct net_local *lp = netdev_priv(dev);
-	int reset_start_time;
+	unsigned long reset_start_time;
 
 	writereg(dev, PP_SelfCTL, readreg(dev, PP_SelfCTL) | POWER_ON_RESET);
 
@@ -1294,7 +1295,7 @@ static void __init reset_chip(struct net_device *dev)
 	/* Wait until the chip is reset */
 	reset_start_time = jiffies;
 	while ((readreg(dev, PP_SelfST) & INIT_DONE) == 0 &&
-	       jiffies - reset_start_time < 2)
+	       time_before(jiffies, reset_start_time + 2))
 		;
 #endif /* !CONFIG_MACH_MX31ADS */
 }
-- 
1.7.0.4
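
For readers unfamiliar with the helpers this patch switches to, here is a
rough userspace sketch of the wrap-safe comparison idea. The tick_* and
naive_expired names are made up for this illustration; the real
time_after(), time_before() and time_after_eq() are kernel macros from
<linux/jiffies.h>. The naive comparison is included only for contrast and
is not what the driver did before the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Userspace stand-ins for the kernel's time_after() family. The tick_
 * names are hypothetical; the kernel versions are macros that also
 * type-check their arguments.
 */
static bool tick_after(unsigned long a, unsigned long b)
{
	/* Signed view of the unsigned difference: negative means b is earlier. */
	return (long)(b - a) < 0;
}

static bool tick_before(unsigned long a, unsigned long b)
{
	return tick_after(b, a);
}

static bool tick_after_eq(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

/* Naive comparison, shown only for contrast: it breaks when the deadline wraps. */
static bool naive_expired(unsigned long now, unsigned long deadline)
{
	return now >= deadline;
}

/* Walk one timeout window that straddles the wrap of the tick counter. */
static void wrap_demo(void)
{
	unsigned long start = ULONG_MAX - 10;
	unsigned long deadline = start + 40;	/* wraps around to 29 */

	assert(tick_before(start, deadline));	/* still inside the window */
	assert(!tick_after_eq(start, deadline));
	assert(tick_after_eq(deadline, deadline));
	assert(naive_expired(start, deadline));	/* wrongly reports a timeout */
}
```

Calling wrap_demo() from a test program exercises all four helpers. The
signed-difference trick assumes the two timestamps are less than half the
counter range apart, which holds for short timeouts like the ones in this
driver.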

* Re: [PATCH net] net/mlx4_en: correct the endianness of doorbell_qpn on big endian platform
From: David Miller @ 2014-12-14  4:43 UTC (permalink / raw)
  To: weiyang; +Cc: David.Laight, eric.dumazet, netdev, gideonn, edumazet, amirv
In-Reply-To: <20141213031338.GA12208@richard>

From: Wei Yang <weiyang@linux.vnet.ibm.com>
Date: Sat, 13 Dec 2014 11:13:38 +0800

> On Mon, Dec 08, 2014 at 10:42:37PM +0800, Wei Yang wrote:
> If you prefer this way, I would like to send a new version for this.
> Is it ok for you?

I'm not so sure.  There are implications when using the __raw_*()
routines.

In particular, using __raw_{read,write}l() also means that the usual
necessary I/O memory barriers are not being performed.

There are therefore no ordering guarantees between __raw_*() and other
I/O or memory accesses whatsoever.
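
To make the ordering concern concrete, here is a loose userspace analogy
written with C11 atomics. It is not kernel MMIO code: the descriptor and
doorbell variables and the ring_raw(), ring_ordered() and
post_descriptor() helpers are all hypothetical. It only shows the shape
of the difference, namely that the "raw" store carries no ordering of its
own, so the caller has to add a barrier if earlier writes must be visible
first.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

static uint32_t descriptor;		/* stands in for data the device reads */
static _Atomic uint32_t doorbell;	/* stands in for the doorbell register */

/* Analogue of a raw store: nothing orders it against earlier writes. */
static void ring_raw(uint32_t qpn)
{
	atomic_store_explicit(&doorbell, qpn, memory_order_relaxed);
}

/* Analogue of an ordered store: barrier first, then the same raw store. */
static void ring_ordered(uint32_t qpn)
{
	atomic_thread_fence(memory_order_release);
	ring_raw(qpn);
}

/* Fill in the descriptor, then ring the doorbell with ordering in place. */
static uint32_t post_descriptor(uint32_t desc, uint32_t qpn)
{
	descriptor = desc;
	ring_ordered(qpn);
	return atomic_load_explicit(&doorbell, memory_order_relaxed);
}

/* Single-threaded check that both writes land; it cannot observe ordering. */
static void doorbell_demo(void)
{
	assert(post_descriptor(0xabcd, 7) == 7);
	assert(descriptor == 0xabcd);
}
```

A single-threaded run cannot show a reordering, so doorbell_demo() only
confirms that the values are written. The point is structural: with
ring_raw() alone, nothing in the code says the descriptor write must be
visible before the doorbell write.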

* Re: [WTF?] random test in netlink_sendmsg()
From: David Miller @ 2014-12-14  4:38 UTC (permalink / raw)
  To: viro; +Cc: kaber, netdev, dmitry.tarnyagin
In-Reply-To: <20141213045133.GI22149@ZenIV.linux.org.uk>

From: Al Viro <viro@ZenIV.linux.org.uk>
Date: Sat, 13 Dec 2014 04:51:33 +0000

> On Sat, Dec 13, 2014 at 03:25:00AM +0000, Al Viro wrote:
>> 	    msg->msg_iter.type == KVEC_ITER &&
> 
> ITER_IOVEC, that is.  And that way it even works...  Are you OK with the
> commit below?

No objection, looks perfectly fine.

* Re: [bisected] tg3 broken in 3.18.0?
From: Nils Holland @ 2014-12-13 21:02 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-pci, rajatxjain

In-Reply-To: <20141212.201831.186234837340644301.davem@davemloft.net>

On Fri, Dec 12, 2014 at 08:18:31PM -0500, David Miller wrote:
> From: Nils Holland <nholland@tisys.org>
> Date: Sat, 13 Dec 2014 02:14:08 +0100
> 
> > 
> > My bisect exercise suggests that the following commit is the culprit:
> > 
> > 89665a6a71408796565bfd29cfa6a7877b17a667 (PCI: Check only the Vendor
> > ID to identify Configuration Request Retry)
> 
> You definitely need to bring this up with the author of that change
> and the relevant list for the PCI subsystem and/or linux-kernel.

I've now already sent an inquiry to Rajat Jain, the author of the
patch in question, and this message here is now also CC'd to
linux-pci@.

With this message, I'd like to add one last result of investigation
I've done today, in the hope that it will aid the folks with more
knowledge to go after the issue.

Basically, I've added a little debug output to tg3.c in the function
tg3_poll_fw(), as that function contained the code that would print
out the "No firmware running" line that was visible in dmesg on those
kernels where tg3 would not work for me. So, I basically had this:

static int tg3_poll_fw(struct tg3 *tp)
{
        int i;
        u32 val;

        netdev_info(tp->dev, "XX: Boom!\n");
        [...]
}

Now, I was looking through dmesg searching for occurrences of this
debug output, using a standard 3.18.0 kernel (where my tg3 doesn't
work) as well as using a 3.18.0 kernel with
89665a6a71408796565bfd29cfa6a7877b17a667 reverted (where my tg3
works). Here are the results:

[standard 3.18.0 (=problematic)]:
[    2.197653] libphy: tg3 mdio bus: probed
[    2.257488] tg3 0000:02:00.0 eth0:
        Tigon3 [partno(BCM57780) rev 57780001] (PCI Express) MAC address
        00:19:99:ce:13:a6
[    2.259589] tg3 0000:02:00.0 eth0:
        attached PHY driver [Broadcom BCM57780] (mii_bus:phy_addr=200:01)
[    2.261740] tg3 0000:02:00.0 eth0:
        RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
[    2.263912] tg3 0000:02:00.0 eth0:
        dma_rwctrl[76180000] dma_mask[64-bit]
[...]
[   10.028002] tg3 0000:02:00.0: irq 25 for MSI/MSI-X
[   10.028247] tg3 0000:02:00.0 enp2s0: XX: Boom!
[   12.157034] tg3 0000:02:00.0 enp2s0: No firmware running


[3.18.0 without above mentioned patch, 3.17.3 is the same, both result
in a working tg3]:
[    1.397167] libphy: tg3 mdio bus: probed
[    1.456473] tg3 0000:02:00.0
        (unnamed net_device) (uninitialized): XX: Boom!
[    1.464987] tg3 0000:02:00.0 eth0:
        Tigon3 [partno(BCM57780) rev 57780001] (PCI Express) MAC address
        00:19:99:ce:13:a6
[    1.467118] tg3 0000:02:00.0 eth0:
        attached PHY driver [Broadcom BCM57780] (mii_bus:phy_addr=200:01)
[    1.469311] tg3 0000:02:00.0 eth0:
        RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
[    1.471500] tg3 0000:02:00.0 eth0:
        dma_rwctrl[76180000] dma_mask[64-bit]
[...]
[    9.631629] tg3 0000:02:00.0: irq 25 for MSI/MSI-X
[    9.631962] tg3 0000:02:00.0 enp2s0: XX: Boom!
[    9.634339] tg3 0000:02:00.0 enp2s0: XX: Boom!
[    9.642741] IPv6:
        ADDRCONF(NETDEV_UP): enp2s0: link is not ready
[   10.479636] tg3 0000:02:00.0
        enp2s0: Link is down
[   11.484498] tg3 0000:02:00.0
        enp2s0: Link is up at 100 Mbps, full duplex

As can be seen, there are two tg3-related sections in my dmesg in both
the working and non-working scenarios: At about 1 - 2 secs, the card
seems to begin initializing, and at about 9 - 10 seconds it is (or
should be) ready to establish a network connection.

My debug section, or tg3.c's tg3_poll_fw(), seems to be called thrice
in the working situation: The first hit occurs at 1.456473 where the tg3
device is still reported as "(unnamed net_device) (uninitialized)".
Then, the section gets hit twice again at around 9.63 - at this point
the driver already reports the card as initialized / by its real name.

In the non-working situation, the debug section seems to be hit only
once, at 10.028247. At this point, the tg3 is already reported as
initialized - just like when it's hit the second and third time in the
working situation.

Bottom line is that commit 89665a6a71408796565bfd29cfa6a7877b17a667
really makes a difference regarding the way the tg3 card is
initialized, which seems to cause the problem.

Greetings,
Nils

* Re: Multicast packets being lost (3.10 stable)
From: David Miller @ 2014-12-13 20:37 UTC (permalink / raw)
  To: linus.luessing; +Cc: openwrt-devel, netdev, bridge, gregkh, shemming
In-Reply-To: <20141210191633.GA2473@odroid>

From: Linus Lüssing <linus.luessing@c0d3.blue>
Date: Wed, 10 Dec 2014 20:16:33 +0100

> did you have a chance to look into backporting these fixes for
> stable yet?

I am not submitting -stable fixes back to 3.10 any longer. At most
I am doing 4 -stable releases, and right now that is 3.18, 3.17,
3.14, and 3.12.

* Re: 3.12.33 - BUG xfrm_selector_match+0x25/0x2f6
From: Julian Anastasov @ 2014-12-13 20:19 UTC (permalink / raw)
  To: Smart Weblications GmbH - Florian Wiessner
  Cc: Steffen Klassert, netdev, LKML, stable, Simon Horman, lvs-devel
In-Reply-To: <5489A465.6090305@smart-weblications.de>


	Hello,

On Thu, 11 Dec 2014, Smart Weblications GmbH - Florian Wiessner wrote:

> >> [  512.485323] CPU: 4 PID: 28142 Comm: vsftpd Not tainted 3.12.33 #5
> >
> > 	Above "#5" is same as previous oops. It means kernel
> > is not updated. Or you updated only the IPVS modules after
> > applying the both patches?
> 
> I did it with make-kpkg --initrd linux_image which only rebuilt the modules,
> correct. I can retry with make clean before building the package

	I just tested PASV and PORT with 3.12.33 including
both patches (seq adj fix + ip_route_me_harder fix) and do not
see any crashes in nf_ct_seqadj_set. If you still have a problem
with FTP, send me more info off-list.

> > 	You can also try without FTP tests to see if there
> > are oopses in xfrm, so that we can close this topic and then
> > to continue for the FTP problem on IPVS lists without
> > bothering non-IPVS people.
> >
> 
> yeah, it seems that the xfrm issue is away.

	Thanks for the confirmation!

Regards

--
Julian Anastasov <ja@ssi.bg>

* [PATCH iproute2 4/4] tc: Allow to easy change network namespace
From: Vadim Kochan @ 2014-12-13 17:55 UTC (permalink / raw)
  To: netdev; +Cc: Vadim Kochan
In-Reply-To: <1418493334-23142-1-git-send-email-vadim4j@gmail.com>

From: Vadim Kochan <vadim4j@gmail.com>

Add a new '-netns' option to simplify executing the following command:

    ip netns exec NETNS tc OPTIONS COMMAND OBJECT

    to

    tc -n[etns] NETNS OPTIONS COMMAND OBJECT

e.g.:

    tc -net vnet0 qdisc

Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
---
 man/man8/tc.8 | 65 +++++++++++++++++++++++++++++++++++++++++++++--------------
 tc/Makefile   |  5 +++++
 tc/tc.c       |  8 +++++++-
 3 files changed, 62 insertions(+), 16 deletions(-)

diff --git a/man/man8/tc.8 b/man/man8/tc.8
index 8d794de..d8f974f 100644
--- a/man/man8/tc.8
+++ b/man/man8/tc.8
@@ -2,7 +2,9 @@
 .SH NAME
 tc \- show / manipulate traffic control settings
 .SH SYNOPSIS
-.B tc qdisc [ add | change | replace | link | delete ] dev
+.B tc
+.RI "[ " OPTIONS " ]"
+.B qdisc [ add | change | replace | link | delete ] dev
 DEV
 .B
 [ parent
@@ -13,7 +15,9 @@ qdisc-id ] qdisc
 [ qdisc specific parameters ]
 .P
 
-.B tc class [ add | change | replace | delete ] dev
+.B tc
+.RI "[ " OPTIONS " ]"
+.B class [ add | change | replace | delete ] dev
 DEV
 .B parent
 qdisc-id
@@ -22,7 +26,9 @@ class-id ] qdisc
 [ qdisc specific parameters ]
 .P
 
-.B tc filter [ add | change | replace | delete ] dev
+.B tc
+.RI "[ " OPTIONS " ]"
+.B filter [ add | change | replace | delete ] dev
 DEV
 .B [ parent
 qdisc-id
@@ -35,21 +41,28 @@ priority filtertype
 flow-id
 
 .B tc
+.RI "[ " OPTIONS " ]"
 .RI "[ " FORMAT " ]"
 .B qdisc show [ dev
 DEV
 .B ]
 .P
 .B tc
+.RI "[ " OPTIONS " ]"
 .RI "[ " FORMAT " ]"
 .B class show dev
 DEV
 .P
-.B tc filter show dev
+.B tc
+.RI "[ " OPTIONS " ]"
+.B filter show dev
 DEV
 
 .P
-.B tc [ -force ] -b\fR[\fIatch\fR] \fB[ filename ]
+.ti 8
+.IR OPTIONS " := {"
+\fB[ -force ] -b\fR[\fIatch\fR] \fB[ filename ] \fR|
+\fB[ \fB-n\fR[\fIetns\fR] name \fB] \fR}
 
 .ti 8
 .IR FORMAT " := {"
@@ -407,6 +420,38 @@ link
 Only available for qdiscs and performs a replace where the node
 must exist already.
 
+.SH OPTIONS
+
+.TP
+.BR "\-b", " \-b filename", " \-batch", " \-batch filename"
+read commands from provided file or standard input and invoke them.
+First failure will cause termination of tc.
+
+.TP
+.BR "\-force"
+don't terminate tc on errors in batch mode.
+If there were any errors during execution of the commands, the application return code will be non zero.
+
+.TP
+.BR "\-n" , " \-net" , " \-netns " <NETNS>
+switches
+.B tc
+to the specified network namespace
+.IR NETNS .
+Actually it just simplifies executing of:
+
+.B ip netns exec
+.IR NETNS
+.B tc
+.RI "[ " OPTIONS " ] " OBJECT " { " COMMAND " | "
+.BR help " }"
+
+to
+
+.B tc
+.RI "-n[etns] " NETNS " [ " OPTIONS " ] " OBJECT " { " COMMAND " | "
+.BR help " }"
+
 .SH FORMAT
 The show command has additional formatting options:
 
@@ -430,16 +475,6 @@ decode filter offset and mask values to equivalent filter commands based on TCP/
 .BR "\-iec"
 print rates in IEC units (ie. 1K = 1024).
 
-.TP
-.BR "\-b", " \-b filename", " \-batch", " \-batch filename"
-read commands from provided file or standard input and invoke them.
-First failure will cause termination of tc.
-
-.TP
-.BR "\-force"
-don't terminate tc on errors in batch mode.
-If there were any errors during execution of the commands, the application return code will be non zero.
-
 .SH HISTORY
 .B tc
 was written by Alexey N. Kuznetsov and added in Linux 2.2.
diff --git a/tc/Makefile b/tc/Makefile
index 1ab36c6..536ed88 100644
--- a/tc/Makefile
+++ b/tc/Makefile
@@ -3,6 +3,11 @@ TCOBJ= tc.o tc_qdisc.o tc_class.o tc_filter.o tc_util.o \
        m_ematch.o emp_ematch.yacc.o emp_ematch.lex.o
 
 include ../Config
+
+ifeq ($(IP_CONFIG_SETNS),y)
+	CFLAGS += -DHAVE_SETNS
+endif
+
 SHARED_LIBS ?= y
 
 TCMODULES :=
diff --git a/tc/tc.c b/tc/tc.c
index 9b50e74..ea4ba10 100644
--- a/tc/tc.c
+++ b/tc/tc.c
@@ -29,6 +29,7 @@
 #include "utils.h"
 #include "tc_util.h"
 #include "tc_common.h"
+#include "namespace.h"
 
 int show_stats = 0;
 int show_details = 0;
@@ -185,7 +186,8 @@ static void usage(void)
 	fprintf(stderr, "Usage: tc [ OPTIONS ] OBJECT { COMMAND | help }\n"
 			"       tc [-force] -batch filename\n"
 	                "where  OBJECT := { qdisc | class | filter | action | monitor }\n"
-	                "       OPTIONS := { -s[tatistics] | -d[etails] | -r[aw] | -p[retty] | -b[atch] [filename] }\n");
+	                "       OPTIONS := { -s[tatistics] | -d[etails] | -r[aw] | -p[retty] | -b[atch] [filename] | "
+			"-n[etns] name }\n");
 }
 
 static int do_cmd(int argc, char **argv)
@@ -293,6 +295,10 @@ int main(int argc, char **argv)
 			if (argc <= 1)
 				usage();
 			batch_file = argv[1];
+		} else if (matches(argv[1], "-netns") == 0) {
+			NEXT_ARG();
+			if (netns_switch(argv[1]))
+				return -1;
 		} else {
 			fprintf(stderr, "Option \"%s\" is unknown, try \"tc -help\".\n", argv[1]);
 			return -1;
-- 
2.1.3

* [PATCH iproute2 3/4] bridge: Allow to easy change network namespace
From: Vadim Kochan @ 2014-12-13 17:55 UTC (permalink / raw)
  To: netdev; +Cc: Vadim Kochan
In-Reply-To: <1418493334-23142-1-git-send-email-vadim4j@gmail.com>

From: Vadim Kochan <vadim4j@gmail.com>

Add a new '-netns' option to simplify executing the following command:

    ip netns exec NETNS bridge OPTIONS COMMAND OBJECT

    to

    bridge -n[etns] NETNS OPTIONS COMMAND OBJECT

e.g.:

    bridge -net vnet0 fdb

Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
---
 bridge/Makefile   |  4 ++++
 bridge/bridge.c   |  7 ++++++-
 man/man8/bridge.8 | 23 ++++++++++++++++++++++-
 3 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/bridge/Makefile b/bridge/Makefile
index 1fb8320..9800753 100644
--- a/bridge/Makefile
+++ b/bridge/Makefile
@@ -2,6 +2,10 @@ BROBJ = bridge.o fdb.o monitor.o link.o mdb.o vlan.o
 
 include ../Config
 
+ifeq ($(IP_CONFIG_SETNS),y)
+	CFLAGS += -DHAVE_SETNS
+endif
+
 all: bridge
 
 bridge: $(BROBJ) $(LIBNETLINK) 
diff --git a/bridge/bridge.c b/bridge/bridge.c
index ee08f90..5fcc552 100644
--- a/bridge/bridge.c
+++ b/bridge/bridge.c
@@ -13,6 +13,7 @@
 #include "SNAPSHOT.h"
 #include "utils.h"
 #include "br_common.h"
+#include "namespace.h"
 
 struct rtnl_handle rth = { .fd = -1 };
 int preferred_family = AF_UNSPEC;
@@ -31,7 +32,7 @@ static void usage(void)
 "Usage: bridge [ OPTIONS ] OBJECT { COMMAND | help }\n"
 "where  OBJECT := { link | fdb | mdb | vlan | monitor }\n"
 "       OPTIONS := { -V[ersion] | -s[tatistics] | -d[etails] |\n"
-"                    -o[neline] | -t[imestamp] \n");
+"                    -o[neline] | -t[imestamp] | -n[etns] name }\n");
 	exit(-1);
 }
 
@@ -112,6 +113,10 @@ main(int argc, char **argv)
 			preferred_family = AF_INET;
 		} else if (strcmp(opt, "-6") == 0) {
 			preferred_family = AF_INET6;
+		} else if (matches(opt, "-netns") == 0) {
+			NEXT_ARG();
+			if (netns_switch(argv[1]))
+				exit(-1);
 		} else {
 			fprintf(stderr, "Option \"%s\" is unknown, try \"bridge help\".\n", opt);
 			exit(-1);
diff --git a/man/man8/bridge.8 b/man/man8/bridge.8
index af31d41..cb3fb46 100644
--- a/man/man8/bridge.8
+++ b/man/man8/bridge.8
@@ -19,7 +19,8 @@ bridge \- show / manipulate bridge addresses and devices
 .ti -8
 .IR OPTIONS " := { "
 \fB\-V\fR[\fIersion\fR] |
-\fB\-s\fR[\fItatistics\fR] }
+\fB\-s\fR[\fItatistics\fR] |
+\fB\-n\fR[\fIetns\fR] name }
 
 .ti -8
 .BR "bridge link set"
@@ -112,6 +113,26 @@ output more information.  If this option
 is given multiple times, the amount of information increases.
 As a rule, the information is statistics or some time values.
 
+.TP
+.BR "\-n" , " \-net" , " \-netns " <NETNS>
+switches
+.B bridge
+to the specified network namespace
+.IR NETNS .
+Actually it just simplifies executing of:
+
+.B ip netns exec
+.IR NETNS
+.B bridge
+.RI "[ " OPTIONS " ] " OBJECT " { " COMMAND " | "
+.BR help " }"
+
+to
+
+.B bridge
+.RI "-n[etns] " NETNS " [ " OPTIONS " ] " OBJECT " { " COMMAND " | "
+.BR help " }"
+
 
 .SH BRIDGE - COMMAND SYNTAX
 
-- 
2.1.3

* [PATCH iproute2 2/4] ip: Allow to easy change network namespace
From: Vadim Kochan @ 2014-12-13 17:55 UTC (permalink / raw)
  To: netdev; +Cc: Vadim Kochan
In-Reply-To: <1418493334-23142-1-git-send-email-vadim4j@gmail.com>

From: Vadim Kochan <vadim4j@gmail.com>

Add a new '-netns' option to simplify executing the following command:

    ip netns exec NETNS ip OPTIONS COMMAND OBJECT

    to

    ip -n[etns] NETNS OPTIONS COMMAND OBJECT

e.g.:

    ip -net vnet0 link add br0 type bridge
    ip -n vnet0 link

Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
---
 ip/ip.c       |  7 ++++++-
 man/man8/ip.8 | 23 ++++++++++++++++++++++-
 2 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/ip/ip.c b/ip/ip.c
index 5f759d5..96e64a3 100644
--- a/ip/ip.c
+++ b/ip/ip.c
@@ -22,6 +22,7 @@
 #include "SNAPSHOT.h"
 #include "utils.h"
 #include "ip_common.h"
+#include "namespace.h"
 
 int preferred_family = AF_UNSPEC;
 int human_readable = 0;
@@ -54,7 +55,7 @@ static void usage(void)
 "                    -4 | -6 | -I | -D | -B | -0 |\n"
 "                    -l[oops] { maximum-addr-flush-attempts } |\n"
 "                    -o[neline] | -t[imestamp] | -b[atch] [filename] |\n"
-"                    -rc[vbuf] [size]}\n");
+"                    -rc[vbuf] [size] | -n[etns] name }\n");
 	exit(-1);
 }
 
@@ -262,6 +263,10 @@ int main(int argc, char **argv)
 			rcvbuf = size;
 		} else if (matches(opt, "-help") == 0) {
 			usage();
+		} else if (matches(opt, "-netns") == 0) {
+			NEXT_ARG();
+			if (netns_switch(argv[1]))
+				exit(-1);
 		} else {
 			fprintf(stderr, "Option \"%s\" is unknown, try \"ip -help\".\n", opt);
 			exit(-1);
diff --git a/man/man8/ip.8 b/man/man8/ip.8
index 2d42e98..0bae59e 100644
--- a/man/man8/ip.8
+++ b/man/man8/ip.8
@@ -31,7 +31,8 @@ ip \- show / manipulate routing, devices, policy routing and tunnels
 \fB\-r\fR[\fIesolve\fR] |
 \fB\-f\fR[\fIamily\fR] {
 .BR inet " | " inet6 " | " ipx " | " dnet " | " link " } | "
-\fB\-o\fR[\fIneline\fR] }
+\fB\-o\fR[\fIneline\fR] |
+\fB\-n\fR[\fIetns\fR] name }
 
 
 .SH OPTIONS
@@ -134,6 +135,26 @@ the output.
 use the system's name resolver to print DNS names instead of
 host addresses.
 
+.TP
+.BR "\-n" , " \-net" , " \-netns " <NETNS>
+switches
+.B ip
+to the specified network namespace
+.IR NETNS .
+Actually it just simplifies executing of:
+
+.B ip netns exec
+.IR NETNS
+.B ip
+.RI "[ " OPTIONS " ] " OBJECT " { " COMMAND " | "
+.BR help " }"
+
+to
+
+.B ip
+.RI "-n[etns] " NETNS " [ " OPTIONS " ] " OBJECT " { " COMMAND " | "
+.BR help " }"
+
 .SH IP - COMMAND SYNTAX
 
 .SS
-- 
2.1.3
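
The -n, -net and -netns spellings all work because the option is tested
with a prefix match. The sketch below shows that idea with a hypothetical
prefix_matches() helper. Its contract (return 0 when the argument is a
prefix of the full option name) is an assumption inferred from the
matches(opt, "-netns") == 0 check in the diff, not a copy of the
iproute2 helper.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical stand-in for the matches() helper used in the diff.
 * Assumed contract: return 0 when arg is a prefix of pattern.
 */
static int prefix_matches(const char *arg, const char *pattern)
{
	size_t len = strlen(arg);

	if (len > strlen(pattern))
		return -1;
	return memcmp(pattern, arg, len);
}

/* The three spellings from the commit message, plus two rejections. */
static void option_demo(void)
{
	assert(prefix_matches("-n", "-netns") == 0);
	assert(prefix_matches("-net", "-netns") == 0);
	assert(prefix_matches("-netns", "-netns") == 0);
	assert(prefix_matches("-netnsx", "-netns") != 0);	/* longer than the option */
	assert(prefix_matches("-x", "-netns") != 0);		/* differs after the dash */
}
```

With a scheme like this, a short prefix can match more than one option, so
the order of the else-if chain decides which option wins.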

* [PATCH iproute2 1/4] lib: Add netns_switch func for change network namespace
From: Vadim Kochan @ 2014-12-13 17:55 UTC (permalink / raw)
  To: netdev; +Cc: Vadim Kochan
In-Reply-To: <1418493334-23142-1-git-send-email-vadim4j@gmail.com>

From: Vadim Kochan <vadim4j@gmail.com>

Move the namespace switching code from ip/ipnetns.c into a new
netns_switch() function in lib/namespace.c so that other tools can
use it to switch network namespaces.

Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
---
 include/namespace.h |  46 +++++++++++++++++++++++
 ip/ipnetns.c        | 106 ++--------------------------------------------------
 lib/Makefile        |   6 ++-
 lib/namespace.c     |  86 ++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 140 insertions(+), 104 deletions(-)
 create mode 100644 include/namespace.h
 create mode 100644 lib/namespace.c

diff --git a/include/namespace.h b/include/namespace.h
new file mode 100644
index 0000000..2f13e65
--- /dev/null
+++ b/include/namespace.h
@@ -0,0 +1,46 @@
+#ifndef __NAMESPACE_H__
+#define __NAMESPACE_H__ 1
+
+#include <sched.h>
+#include <sys/mount.h>
+#include <errno.h>
+
+#define NETNS_RUN_DIR "/var/run/netns"
+#define NETNS_ETC_DIR "/etc/netns"
+
+#ifndef CLONE_NEWNET
+#define CLONE_NEWNET 0x40000000	/* New network namespace (lo, device, names sockets, etc) */
+#endif
+
+#ifndef MNT_DETACH
+#define MNT_DETACH	0x00000002	/* Just detach from the tree */
+#endif /* MNT_DETACH */
+
+/* sys/mount.h may be out too old to have these */
+#ifndef MS_REC
+#define MS_REC		16384
+#endif
+
+#ifndef MS_SLAVE
+#define MS_SLAVE	(1 << 19)
+#endif
+
+#ifndef MS_SHARED
+#define MS_SHARED	(1 << 20)
+#endif
+
+#ifndef HAVE_SETNS
+static int setns(int fd, int nstype)
+{
+#ifdef __NR_setns
+	return syscall(__NR_setns, fd, nstype);
+#else
+	errno = ENOSYS;
+	return -1;
+#endif
+}
+#endif /* HAVE_SETNS */
+
+extern int netns_switch(char *netns);
+
+#endif /* __NAMESPACE_H__ */
diff --git a/ip/ipnetns.c b/ip/ipnetns.c
index 1c8aa02..519d518 100644
--- a/ip/ipnetns.c
+++ b/ip/ipnetns.c
@@ -17,42 +17,7 @@
 
 #include "utils.h"
 #include "ip_common.h"
-
-#define NETNS_RUN_DIR "/var/run/netns"
-#define NETNS_ETC_DIR "/etc/netns"
-
-#ifndef CLONE_NEWNET
-#define CLONE_NEWNET 0x40000000	/* New network namespace (lo, device, names sockets, etc) */
-#endif
-
-#ifndef MNT_DETACH
-#define MNT_DETACH	0x00000002	/* Just detach from the tree */
-#endif /* MNT_DETACH */
-
-/* sys/mount.h may be out too old to have these */
-#ifndef MS_REC
-#define MS_REC		16384
-#endif
-
-#ifndef MS_SLAVE
-#define MS_SLAVE	(1 << 19)
-#endif
-
-#ifndef MS_SHARED
-#define MS_SHARED	(1 << 20)
-#endif
-
-#ifndef HAVE_SETNS
-static int setns(int fd, int nstype)
-{
-#ifdef __NR_setns
-	return syscall(__NR_setns, fd, nstype);
-#else
-	errno = ENOSYS;
-	return -1;
-#endif
-}
-#endif /* HAVE_SETNS */
+#include "namespace.h"
 
 static int usage(void)
 {
@@ -101,42 +66,12 @@ static int netns_list(int argc, char **argv)
 	return 0;
 }
 
-static void bind_etc(const char *name)
-{
-	char etc_netns_path[MAXPATHLEN];
-	char netns_name[MAXPATHLEN];
-	char etc_name[MAXPATHLEN];
-	struct dirent *entry;
-	DIR *dir;
-
-	snprintf(etc_netns_path, sizeof(etc_netns_path), "%s/%s", NETNS_ETC_DIR, name);
-	dir = opendir(etc_netns_path);
-	if (!dir)
-		return;
-
-	while ((entry = readdir(dir)) != NULL) {
-		if (strcmp(entry->d_name, ".") == 0)
-			continue;
-		if (strcmp(entry->d_name, "..") == 0)
-			continue;
-		snprintf(netns_name, sizeof(netns_name), "%s/%s", etc_netns_path, entry->d_name);
-		snprintf(etc_name, sizeof(etc_name), "/etc/%s", entry->d_name);
-		if (mount(netns_name, etc_name, "none", MS_BIND, NULL) < 0) {
-			fprintf(stderr, "Bind %s -> %s failed: %s\n",
-				netns_name, etc_name, strerror(errno));
-		}
-	}
-	closedir(dir);
-}
-
 static int netns_exec(int argc, char **argv)
 {
 	/* Setup the proper environment for apps that are not netns
 	 * aware, and execute a program in that environment.
 	 */
-	const char *name, *cmd;
-	char net_path[MAXPATHLEN];
-	int netns;
+	const char *cmd;
 
 	if (argc < 1) {
 		fprintf(stderr, "No netns name specified\n");
@@ -146,45 +81,10 @@ static int netns_exec(int argc, char **argv)
 		fprintf(stderr, "No command specified\n");
 		return -1;
 	}
-
-	name = argv[0];
 	cmd = argv[1];
-	snprintf(net_path, sizeof(net_path), "%s/%s", NETNS_RUN_DIR, name);
-	netns = open(net_path, O_RDONLY | O_CLOEXEC);
-	if (netns < 0) {
-		fprintf(stderr, "Cannot open network namespace \"%s\": %s\n",
-			name, strerror(errno));
-		return -1;
-	}
-
-	if (setns(netns, CLONE_NEWNET) < 0) {
-		fprintf(stderr, "setting the network namespace \"%s\" failed: %s\n",
-			name, strerror(errno));
-		return -1;
-	}
 
-	if (unshare(CLONE_NEWNS) < 0) {
-		fprintf(stderr, "unshare failed: %s\n", strerror(errno));
-		return -1;
-	}
-	/* Don't let any mounts propagate back to the parent */
-	if (mount("", "/", "none", MS_SLAVE | MS_REC, NULL)) {
-		fprintf(stderr, "\"mount --make-rslave /\" failed: %s\n",
-			strerror(errno));
+	if (netns_switch(argv[0]))
 		return -1;
-	}
-	/* Mount a version of /sys that describes the network namespace */
-	if (umount2("/sys", MNT_DETACH) < 0) {
-		fprintf(stderr, "umount of /sys failed: %s\n", strerror(errno));
-		return -1;
-	}
-	if (mount(name, "/sys", "sysfs", 0, NULL) < 0) {
-		fprintf(stderr, "mount of /sys failed: %s\n",strerror(errno));
-		return -1;
-	}
-
-	/* Setup bind mounts for config files in /etc */
-	bind_etc(name);
 
 	fflush(stdout);
 
diff --git a/lib/Makefile b/lib/Makefile
index a42b885..66f89f1 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -1,8 +1,12 @@
 include ../Config
 
+ifeq ($(IP_CONFIG_SETNS),y)
+	CFLAGS += -DHAVE_SETNS
+endif
+
 CFLAGS += -fPIC
 
-UTILOBJ=utils.o rt_names.o ll_types.o ll_proto.o ll_addr.o inet_proto.o
+UTILOBJ=utils.o rt_names.o ll_types.o ll_proto.o ll_addr.o inet_proto.o namespace.o
 
 NLOBJ=libgenl.o ll_map.o libnetlink.o
 
diff --git a/lib/namespace.c b/lib/namespace.c
new file mode 100644
index 0000000..1554ce0
--- /dev/null
+++ b/lib/namespace.c
@@ -0,0 +1,86 @@
+/*
+ * namespace.c
+ *
+ *		This program is free software; you can redistribute it and/or
+ *		modify it under the terms of the GNU General Public License
+ *		as published by the Free Software Foundation; either version
+ *		2 of the License, or (at your option) any later version.
+ */
+
+#include <fcntl.h>
+#include <dirent.h>
+
+#include "utils.h"
+#include "namespace.h"
+
+static void bind_etc(const char *name)
+{
+	char etc_netns_path[MAXPATHLEN];
+	char netns_name[MAXPATHLEN];
+	char etc_name[MAXPATHLEN];
+	struct dirent *entry;
+	DIR *dir;
+
+	snprintf(etc_netns_path, sizeof(etc_netns_path), "%s/%s", NETNS_ETC_DIR, name);
+	dir = opendir(etc_netns_path);
+	if (!dir)
+		return;
+
+	while ((entry = readdir(dir)) != NULL) {
+		if (strcmp(entry->d_name, ".") == 0)
+			continue;
+		if (strcmp(entry->d_name, "..") == 0)
+			continue;
+		snprintf(netns_name, sizeof(netns_name), "%s/%s", etc_netns_path, entry->d_name);
+		snprintf(etc_name, sizeof(etc_name), "/etc/%s", entry->d_name);
+		if (mount(netns_name, etc_name, "none", MS_BIND, NULL) < 0) {
+			fprintf(stderr, "Bind %s -> %s failed: %s\n",
+				netns_name, etc_name, strerror(errno));
+		}
+	}
+	closedir(dir);
+}
+
+int netns_switch(char *name)
+{
+	char net_path[MAXPATHLEN];
+	int netns;
+
+	snprintf(net_path, sizeof(net_path), "%s/%s", NETNS_RUN_DIR, name);
+	netns = open(net_path, O_RDONLY | O_CLOEXEC);
+	if (netns < 0) {
+		fprintf(stderr, "Cannot open network namespace \"%s\": %s\n",
+			name, strerror(errno));
+		return -1;
+	}
+
+	if (setns(netns, CLONE_NEWNET) < 0) {
+		fprintf(stderr, "setting the network namespace \"%s\" failed: %s\n",
+			name, strerror(errno));
+		return -1;
+	}
+
+	if (unshare(CLONE_NEWNS) < 0) {
+		fprintf(stderr, "unshare failed: %s\n", strerror(errno));
+		return -1;
+	}
+	/* Don't let any mounts propagate back to the parent */
+	if (mount("", "/", "none", MS_SLAVE | MS_REC, NULL)) {
+		fprintf(stderr, "\"mount --make-rslave /\" failed: %s\n",
+			strerror(errno));
+		return -1;
+	}
+	/* Mount a version of /sys that describes the network namespace */
+	if (umount2("/sys", MNT_DETACH) < 0) {
+		fprintf(stderr, "umount of /sys failed: %s\n", strerror(errno));
+		return -1;
+	}
+	if (mount(name, "/sys", "sysfs", 0, NULL) < 0) {
+		fprintf(stderr, "mount of /sys failed: %s\n",strerror(errno));
+		return -1;
+	}
+
+	/* Setup bind mounts for config files in /etc */
+	bind_etc(name);
+	return 0;
+}
-- 
2.1.3

^ permalink raw reply related

* [PATCH iproute2 0/4] Switch network ns w/o execvp for iproute2 tools
From: Vadim Kochan @ 2014-12-13 17:55 UTC (permalink / raw)
  To: netdev; +Cc: Vadim Kochan

This series adds a new -n[etns] option to the ip, tc and bridge tools, which
makes switching to a specified network namespace easier and faster. So instead of:

    ip netns exec NETNS { ip | tc | bridge } OBJECT COMMAND

it will be possible to do the same with:

    { ip | tc | bridge } -n[etns] NETNS OBJECT COMMAND

I skipped misc tools and will work on them later.
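
To illustrate the first step such a switch needs, here is a minimal sketch,
not the code from this series. It only builds the /var/run/netns/NAME path
that is then opened and passed to setns(). netns_path is a hypothetical
helper; unlike the plain snprintf() call in the patch it reports truncation,
and rejecting empty names or names containing '/' is an assumption of this
sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Same directory the patch uses for named namespaces. */
#define NETNS_RUN_DIR "/var/run/netns"

/*
 * Hypothetical helper (not in the patch): build the path of a named
 * network namespace into buf. Returns 0 on success, -1 if the name is
 * empty, contains a '/', or the result would not fit in buf.
 */
static int netns_path(char *buf, size_t size, const char *name)
{
	int n;

	assert(buf != NULL && name != NULL);

	/* Assumption of this sketch: a name is a single path component. */
	if (name[0] == '\0' || strchr(name, '/') != NULL)
		return -1;

	n = snprintf(buf, size, "%s/%s", NETNS_RUN_DIR, name);
	if (n < 0 || (size_t)n >= size)
		return -1;	/* output error or truncated path */
	return 0;
}
```

With a 64-byte buffer, netns_path(buf, sizeof(buf), "vnet0") fills buf with
/var/run/netns/vnet0; the caller would then open() that path and hand the
descriptor to setns(), which needs the appropriate privileges.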

Vadim Kochan (4):
  lib: Add netns_switch func for change network namespace
  ip: Allow to easy change network namespace
  bridge: Allow to easy change network namespace
  tc: Allow to easy change network namespace

 bridge/Makefile     |   4 ++
 bridge/bridge.c     |   7 +++-
 include/namespace.h |  46 +++++++++++++++++++++++
 ip/ip.c             |   7 +++-
 ip/ipnetns.c        | 106 ++--------------------------------------------------
 lib/Makefile        |   6 ++-
 lib/namespace.c     |  86 ++++++++++++++++++++++++++++++++++++++++++
 man/man8/bridge.8   |  23 +++++++++++-
 man/man8/ip.8       |  23 +++++++++++-
 man/man8/tc.8       |  65 ++++++++++++++++++++++++--------
 tc/Makefile         |   5 +++
 tc/tc.c             |   8 +++-
 12 files changed, 262 insertions(+), 124 deletions(-)
 create mode 100644 include/namespace.h
 create mode 100644 lib/namespace.c

-- 
2.1.3

^ permalink raw reply

* Re: [PATCH iproute2 v4] ip: Simplify executing ip cmd within network ns
From: Jiri Pirko @ 2014-12-13 15:20 UTC (permalink / raw)
  To: vadim4j; +Cc: netdev
In-Reply-To: <20141213133210.GA12291@angus-think.lan>

Sat, Dec 13, 2014 at 02:32:10PM CET, vadim4j@gmail.com wrote:
>On Sat, Dec 13, 2014 at 10:58:03AM +0200, vadim4j@gmail.com wrote:
>> On Sat, Dec 13, 2014 at 10:42:43AM +0200, vadim4j@gmail.com wrote:
>> > On Sat, Dec 13, 2014 at 09:29:36AM +0100, Jiri Pirko wrote:
>> > > Fri, Dec 12, 2014 at 11:15:07PM CET, vadim4j@gmail.com wrote:
>> > > >From: Vadim Kochan <vadim4j@gmail.com>
>> > > >
>> > > >Added new '-netns' option to simplify executing following cmd:
>> > > >
>> > > >    ip netns exec NETNS ip OPTIONS COMMAND OBJECT
>> > > >
>> > > >    to
>> > > >
>> > > >    ip -n[etns] NETNS OPTIONS COMMAND OBJECT
>> > > >
>> > > >e.g.:
>> > > >
>> > > >    ip -net vnet0 link add br0 type bridge
>> > > >    ip -n vnet0 link
>> > > >
>> > > >Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
>> > > 
>> > > 
>> > > This looks good. I'm still missing support in tc, bridge, etc. I think
>> > > it would be great to do this in the same patch/patchset.
>> > > 
>> >  I planned to do this in future patches, after these main
>> >  changes are accepted. Actually, adding this option to the other
>> >  tools is trivial.
>> > 
>> >  Anyway, maybe I will re-send a v5 with support for these tools if I have time.
>> > 
>> >  Regards,
>> 
>> BTW, some tools already have a '-n' option, so I think only '-net' can be
>> used in such cases.


Yep, that is my point. I would like to have the same option for all.

>> 
>> Regards,
>
>OK, I am going to split the changes into a series of patches and bring the new
>option to the ip, tc, and bridge tools.
>Regarding the other misc tools, I will do them later as I am not very familiar with them.
>Are you OK with this, Jiri?

Yep. Thank you!

>
>Regards,

^ permalink raw reply

* RE: [RFC PATCH net-next 1/1] net: Support for switch port configuration
From: Rosen, Rami @ 2014-12-13 14:39 UTC (permalink / raw)
  To: Varlese, Marco, Roopa Prabhu, Jiri Pirko
  Cc: John Fastabend, netdev@vger.kernel.org,
	stephen@networkplumber.org, Fastabend, John R, sfeldma@gmail.com,
	linux-kernel@vger.kernel.org
In-Reply-To: <C4896FB061E7DE4AAC93031BDCA044B104AC4609@IRSMSX108.ger.corp.intel.com>

Hi, all,

Regarding the preference for netlink sockets over ethtool IOCTLs for setting kernel network attributes from userspace, I fully agree with Marco. The netlink API is much more structured, and
much better suited to this type of operation, than the IOCTL-based ethtool.
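
As a rough illustration of what "more structured" means here: netlink
carries each setting as a type-length-value attribute, whereas ethtool
ioctls pass command-specific structs. Below is a minimal sketch, assuming a
header layout like the kernel's struct nlattr (16-bit length, 16-bit type,
payload padded to 4 bytes). attr_hdr, attr_put and the PORT_ATTR_* ids are
hypothetical names for illustration only, not an existing kernel or
iproute2 API; LEARNING and LOOPBACK are taken from the examples in this
thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Attributes are padded to a 4-byte boundary. */
#define ATTR_ALIGN(len) (((len) + 3) & ~(size_t)3)

/* 16-bit length (header + payload, excluding padding), then 16-bit type. */
struct attr_hdr {
	uint16_t len;
	uint16_t type;
};

/* Hypothetical attribute ids, named after the examples in this thread. */
enum {
	PORT_ATTR_LEARNING = 1,
	PORT_ATTR_LOOPBACK = 2,
};

/*
 * Append one attribute at offset off in buf. Returns the new (4-byte
 * aligned) offset, or 0 if the attribute does not fit.
 */
static size_t attr_put(uint8_t *buf, size_t size, size_t off,
		       uint16_t type, const void *data, uint16_t data_len)
{
	struct attr_hdr hdr;
	size_t total = sizeof(hdr) + data_len;
	size_t padded = ATTR_ALIGN(total);

	assert(buf != NULL);
	if (total > UINT16_MAX || off > size || padded > size - off)
		return 0;

	hdr.len = (uint16_t)total;
	hdr.type = type;
	memcpy(buf + off, &hdr, sizeof(hdr));
	if (data_len)
		memcpy(buf + off + sizeof(hdr), data, data_len);
	memset(buf + off + total, 0, padded - total);	/* zero the padding */
	return off + padded;
}
```

Because every attribute is self-describing, a receiver can skip types it
does not know, which is part of why new per-port settings can be added
without changing a fixed struct layout.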

Regards,
Rami Rosen
Software Engineer, Intel

-----Original Message-----
From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On Behalf Of Varlese, Marco
Sent: Friday, December 12, 2014 11:20
To: Roopa Prabhu; Jiri Pirko
Cc: John Fastabend; netdev@vger.kernel.org; stephen@networkplumber.org; Fastabend, John R; sfeldma@gmail.com; linux-kernel@vger.kernel.org
Subject: RE: [RFC PATCH net-next 1/1] net: Support for switch port configuration

> -----Original Message-----
> From: Roopa Prabhu [mailto:roopa@cumulusnetworks.com]
> Sent: Thursday, December 11, 2014 5:41 PM
> To: Jiri Pirko
> Cc: Varlese, Marco; John Fastabend; netdev@vger.kernel.org; 
> stephen@networkplumber.org; Fastabend, John R; sfeldma@gmail.com; 
> linux-kernel@vger.kernel.org
> Subject: Re: [RFC PATCH net-next 1/1] net: Support for switch port 
> configuration
> 
> On 12/11/14, 8:56 AM, Jiri Pirko wrote:
> > Thu, Dec 11, 2014 at 05:37:46PM CET, roopa@cumulusnetworks.com wrote:
> >> On 12/11/14, 3:01 AM, Jiri Pirko wrote:
> >>> Thu, Dec 11, 2014 at 10:59:42AM CET, marco.varlese@intel.com wrote:
> >>>>> -----Original Message-----
> >>>>> From: John Fastabend [mailto:john.fastabend@gmail.com]
> >>>>> Sent: Wednesday, December 10, 2014 5:04 PM
> >>>>> To: Jiri Pirko
> >>>>> Cc: Varlese, Marco; netdev@vger.kernel.org; 
> >>>>> stephen@networkplumber.org; Fastabend, John R; 
> >>>>> roopa@cumulusnetworks.com; sfeldma@gmail.com; linux- 
> >>>>> kernel@vger.kernel.org
> >>>>> Subject: Re: [RFC PATCH net-next 1/1] net: Support for switch 
> >>>>> port configuration
> >>>>>
> >>>>> On 12/10/2014 08:50 AM, Jiri Pirko wrote:
> >>>>>> Wed, Dec 10, 2014 at 05:23:40PM CET, marco.varlese@intel.com
> wrote:
> >>>>>>> From: Marco Varlese <marco.varlese@intel.com>
> >>>>>>>
> >>>>>>> Switch hardware offers a list of attributes that are 
> >>>>>>> configurable on a per port basis.
> >>>>>>> This patch provides a mechanism to configure switch ports by 
> >>>>>>> adding an NDO for setting specific values to specific attributes.
> >>>>>>> There will be a separate patch that extends iproute2 to call 
> >>>>>>> the new NDO.
> >>>>>> What are these attributes? Can you give some examples. I'm 
> >>>>>> asking because there is a plan to pass generic attributes to 
> >>>>>> switch ports replacing current specific 
> >>>>>> ndo_switch_port_stp_update. In this case, bridge is setting that attribute.
> >>>>>>
> >>>>>> Is there need to set something directly from userspace or does 
> >>>>>> it make rather sense to use involved bridge/ovs/bond ? I think 
> >>>>>> that both will be needed.
> >>>>> +1
> >>>>>
> >>>>> I think for many attributes it would be best to have both. The 
> >>>>> in kernel callers and netlink userspace can use the same driver ndo_ops.
> >>>>>
> >>>>> But then we don't _require_ any specific bridge/ovs/etc module.
> >>>>> And we may have some attributes that are not specific to any 
> >>>>> existing software module. I'm guessing Marco has some examples 
> >>>>> of
> these.
> >>>>>
> >>>>> [...]
> >>>>>
> >>>>>
> >>>>> --
> >>>>> John Fastabend         Intel Corporation
> >>>> We do have a need to configure the attributes directly from 
> >>>> user-space
> and I have identified the tool to do that in iproute2.
> >>>>
> >>>> An example of attributes are:
> >>>> * enabling/disabling of learning of source addresses on a given 
> >>>> port (you can imagine the attribute called LEARNING for example);
> >>>> * internal loopback control (i.e. LOOPBACK) which will control 
> >>>> how the flow of traffic behaves from the switch fabric towards an 
> >>>> egress port;
> >>>> * flooding for broadcast/multicast/unicast type of packets (i.e.
> >>>> BFLOODING, MFLOODING, UFLOODING);
> >>>>
> >>>> Some attributes would be of the type enabled/disabled while other 
> >>>> will
> allow specific values to allow the user to configure different 
> behaviours of that feature on that particular port on that platform.
> >>>>
> >>>> One thing to mention - as John stated as well - there might be 
> >>>> some
> attributes that are not specific to any software module but rather 
> have to do with the actual hardware/platform to configure.
> >>>>
> >>>> I hope this clarifies some points.
> >>> It does. Makes sense. We need to expose this attr set/get for both 
> >>> in-kernel and userspace use cases.
> >>>
> >>> Please adjust you patch for this. Also, as a second patch, it 
> >>> would be great if you can convert ndo_switch_port_stp_update to 
> >>> this new
> ndo.
> >> Why are we exposing generic switch attribute get/set from userspace 
> >> ?. We already have specific attributes for learning/flooding which 
> >> can be extended further.
> > Yes, but that is for PF_BRIDGE and bridge specific attributes. There 
> > might be another generic attrs, no?
> I cant think of any. And plus, the whole point of switchdev l2 
> offloads was to map these to existing bridge attributes. And we 
> already have a match for some of the attributes that marco wants.
> 
> If there is a need for such attributes, i don't see why it is needed 
> for switch devices only.
> It is needed for any hw (nics etc). And, a precedence to this is to do 
> it via ethtool.
> 
> Having said that, am sure we will find a need for this in the future.
> And having a netlink attribute always helps.
> 
> Today, it seems like these can be mapped to existing attributes that 
> are settable via ndo_bridge_setlink/getlink.
> 
> >
> >> And for in kernel api....i had a sample patch in my RFC series 
> >> (Which i was going to resubmit, until it was decided that we will 
> >> use existing api around
> >> ndo_bridge_setlink/ndo_bridge_getlink):
> >> http://www.spinics.net/lists/netdev/msg305473.html
> > Yes, this might become handy for other generic non-bridge attrs.
> >
> >> Thanks,
> >> Roopa
> >>
> >>
> >>

The list I provided is only a subset of the attributes we will need to expose. I do have more and I'm sure that more will come in the future. As I mentioned a few posts earlier, some attributes are generic and some are not.

I did not consider ethtool for a few reasons, but the main one is that I was under the impression that netlink was preferred in many circumstances over ethtool_ops. Plus, all the cases I have identified so far fit nicely into the setlink set of operations.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at  http://vger.kernel.org/majordomo-info.html
---------------------------------------------------------------------
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.

^ permalink raw reply

* Re: [PATCH iproute2 v4] ip: Simplify executing ip cmd within network ns
From: vadim4j @ 2014-12-13 13:32 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: Vadim Kochan, netdev
In-Reply-To: <20141213085803.GA12446@angus-think.lan>

On Sat, Dec 13, 2014 at 10:58:03AM +0200, vadim4j@gmail.com wrote:
> On Sat, Dec 13, 2014 at 10:42:43AM +0200, vadim4j@gmail.com wrote:
> > On Sat, Dec 13, 2014 at 09:29:36AM +0100, Jiri Pirko wrote:
> > > Fri, Dec 12, 2014 at 11:15:07PM CET, vadim4j@gmail.com wrote:
> > > >From: Vadim Kochan <vadim4j@gmail.com>
> > > >
> > > >Added new '-netns' option to simplify executing following cmd:
> > > >
> > > >    ip netns exec NETNS ip OPTIONS COMMAND OBJECT
> > > >
> > > >    to
> > > >
> > > >    ip -n[etns] NETNS OPTIONS COMMAND OBJECT
> > > >
> > > >e.g.:
> > > >
> > > >    ip -net vnet0 link add br0 type bridge
> > > >    ip -n vnet0 link
> > > >
> > > >Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
> > > 
> > > 
> > > This looks good. I'm still missing support in tc, bridge, etc. I think
> > > it would be great to do this in the same patch/patchset.
> > > 
> >  I planned to do this in future patches, after these main
> >  changes are accepted. Actually, adding this option to the other
> >  tools is trivial.
> > 
> >  Anyway, maybe I will re-send a v5 with support for these tools if I have time.
> > 
> >  Regards,
> 
> BTW, some tools already have a '-n' option, so I think only '-net' can be
> used in such cases.
> 
> Regards,

OK, I am going to split the changes into a series of patches and bring the new
option to the ip, tc, and bridge tools.
Regarding the other misc tools, I will do them later as I am not very familiar with them.
Are you OK with this, Jiri?

Regards,

^ permalink raw reply

* Re: [PATCH iproute2 v4] ip: Simplify executing ip cmd within network ns
From: vadim4j @ 2014-12-13  8:58 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: Vadim Kochan, netdev
In-Reply-To: <20141213084243.GA3284@angus-think.lan>

On Sat, Dec 13, 2014 at 10:42:43AM +0200, vadim4j@gmail.com wrote:
> On Sat, Dec 13, 2014 at 09:29:36AM +0100, Jiri Pirko wrote:
> > Fri, Dec 12, 2014 at 11:15:07PM CET, vadim4j@gmail.com wrote:
> > >From: Vadim Kochan <vadim4j@gmail.com>
> > >
> > >Added new '-netns' option to simplify executing following cmd:
> > >
> > >    ip netns exec NETNS ip OPTIONS COMMAND OBJECT
> > >
> > >    to
> > >
> > >    ip -n[etns] NETNS OPTIONS COMMAND OBJECT
> > >
> > >e.g.:
> > >
> > >    ip -net vnet0 link add br0 type bridge
> > >    ip -n vnet0 link
> > >
> > >Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
> > 
> > 
> > This looks good. I'm still missing support in tc, bridge, etc. I think
> > it would be great to do this in the same patch/patchset.
> > 
>  I planned to do this in future patches, after these main
>  changes are accepted. Actually, adding this option to the other
>  tools is trivial.
> 
>  Anyway, maybe I will re-send a v5 with support for these tools if I have time.
> 
>  Regards,

BTW, some tools already have a '-n' option, so I think only '-net' can be
used in such cases.
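
The abbreviation issue can be sketched as follows. This is a minimal
illustration, not iproute2 code: opt_matches and its min_len parameter are
hypothetical, and the idea is simply that a tool whose '-n' is already
taken requires a longer prefix before it accepts the namespace option.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from iproute2): return 1 if arg is an
 * abbreviation of full that is at least min_len characters long,
 * 0 otherwise.
 */
static int opt_matches(const char *arg, const char *full, size_t min_len)
{
	size_t len;

	assert(arg != NULL && full != NULL);

	len = strlen(arg);
	if (len < min_len || len > strlen(full))
		return 0;
	/* arg must be a leading part of the full option name */
	return strncmp(arg, full, len) == 0;
}
```

With min_len set to 2, "-n", "-net" and "-netns" would all select the
option; a tool that already uses "-n" could pass 4 so that only "-net" and
longer forms are accepted.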

Regards,

^ permalink raw reply

* Re: [PATCH iproute2 v4] ip: Simplify executing ip cmd within network ns
From: vadim4j @ 2014-12-13  8:42 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: Vadim Kochan, netdev
In-Reply-To: <20141213082936.GA1849@nanopsycho.orion>

On Sat, Dec 13, 2014 at 09:29:36AM +0100, Jiri Pirko wrote:
> Fri, Dec 12, 2014 at 11:15:07PM CET, vadim4j@gmail.com wrote:
> >From: Vadim Kochan <vadim4j@gmail.com>
> >
> >Added new '-netns' option to simplify executing following cmd:
> >
> >    ip netns exec NETNS ip OPTIONS COMMAND OBJECT
> >
> >    to
> >
> >    ip -n[etns] NETNS OPTIONS COMMAND OBJECT
> >
> >e.g.:
> >
> >    ip -net vnet0 link add br0 type bridge
> >    ip -n vnet0 link
> >
> >Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
> 
> 
> This looks good. I'm still missing support in tc, bridge, etc. I think
> it would be great to do this in the same patch/patchset.
> 
 I planned to do this in future patches, after these main
 changes are accepted. Actually, adding this option to the other
 tools is trivial.

 Anyway, maybe I will re-send a v5 with support for these tools if I have time.

 Regards,

^ permalink raw reply

* Re: [PATCH iproute2 v4] ip: Simplify executing ip cmd within network ns
From: Jiri Pirko @ 2014-12-13  8:29 UTC (permalink / raw)
  To: Vadim Kochan; +Cc: netdev
In-Reply-To: <1418422507-6635-1-git-send-email-vadim4j@gmail.com>

Fri, Dec 12, 2014 at 11:15:07PM CET, vadim4j@gmail.com wrote:
>From: Vadim Kochan <vadim4j@gmail.com>
>
>Added new '-netns' option to simplify executing following cmd:
>
>    ip netns exec NETNS ip OPTIONS COMMAND OBJECT
>
>    to
>
>    ip -n[etns] NETNS OPTIONS COMMAND OBJECT
>
>e.g.:
>
>    ip -net vnet0 link add br0 type bridge
>    ip -n vnet0 link
>
>Signed-off-by: Vadim Kochan <vadim4j@gmail.com>


This looks good. I'm still missing support in tc, bridge, etc. I think
it would be great to do this in the same patch/patchset.


>---
> include/namespace.h |  46 +++++++++++++++++++++++
> ip/ip.c             |   5 +++
> ip/ipnetns.c        | 106 ++--------------------------------------------------
> lib/Makefile        |   6 ++-
> lib/namespace.c     |  86 ++++++++++++++++++++++++++++++++++++++++++
> man/man8/ip.8       |  23 +++++++++++-
> 6 files changed, 167 insertions(+), 105 deletions(-)
> create mode 100644 include/namespace.h
> create mode 100644 lib/namespace.c
>
>diff --git a/include/namespace.h b/include/namespace.h
>new file mode 100644
>index 0000000..2f13e65
>--- /dev/null
>+++ b/include/namespace.h
>@@ -0,0 +1,46 @@
>+#ifndef __NAMESPACE_H__
>+#define __NAMESPACE_H__ 1
>+
>+#include <sched.h>
>+#include <sys/mount.h>
>+#include <errno.h>
>+
>+#define NETNS_RUN_DIR "/var/run/netns"
>+#define NETNS_ETC_DIR "/etc/netns"
>+
>+#ifndef CLONE_NEWNET
>+#define CLONE_NEWNET 0x40000000	/* New network namespace (lo, device, names sockets, etc) */
>+#endif
>+
>+#ifndef MNT_DETACH
>+#define MNT_DETACH	0x00000002	/* Just detach from the tree */
>+#endif /* MNT_DETACH */
>+
>+/* sys/mount.h may be out too old to have these */
>+#ifndef MS_REC
>+#define MS_REC		16384
>+#endif
>+
>+#ifndef MS_SLAVE
>+#define MS_SLAVE	(1 << 19)
>+#endif
>+
>+#ifndef MS_SHARED
>+#define MS_SHARED	(1 << 20)
>+#endif
>+
>+#ifndef HAVE_SETNS
>+static int setns(int fd, int nstype)
>+{
>+#ifdef __NR_setns
>+	return syscall(__NR_setns, fd, nstype);
>+#else
>+	errno = ENOSYS;
>+	return -1;
>+#endif
>+}
>+#endif /* HAVE_SETNS */
>+
>+extern int netns_switch(char *netns);
>+
>+#endif /* __NAMESPACE_H__ */
>diff --git a/ip/ip.c b/ip/ip.c
>index 5f759d5..d6c9123 100644
>--- a/ip/ip.c
>+++ b/ip/ip.c
>@@ -22,6 +22,7 @@
> #include "SNAPSHOT.h"
> #include "utils.h"
> #include "ip_common.h"
>+#include "namespace.h"
> 
> int preferred_family = AF_UNSPEC;
> int human_readable = 0;
>@@ -262,6 +263,10 @@ int main(int argc, char **argv)
> 			rcvbuf = size;
> 		} else if (matches(opt, "-help") == 0) {
> 			usage();
>+		} else if (matches(opt, "-netns") == 0) {
>+			NEXT_ARG();
>+			if (netns_switch(argv[1]))
>+				exit(-1);
> 		} else {
> 			fprintf(stderr, "Option \"%s\" is unknown, try \"ip -help\".\n", opt);
> 			exit(-1);
>diff --git a/ip/ipnetns.c b/ip/ipnetns.c
>index 1c8aa02..519d518 100644
>--- a/ip/ipnetns.c
>+++ b/ip/ipnetns.c
>@@ -17,42 +17,7 @@
> 
> #include "utils.h"
> #include "ip_common.h"
>-
>-#define NETNS_RUN_DIR "/var/run/netns"
>-#define NETNS_ETC_DIR "/etc/netns"
>-
>-#ifndef CLONE_NEWNET
>-#define CLONE_NEWNET 0x40000000	/* New network namespace (lo, device, names sockets, etc) */
>-#endif
>-
>-#ifndef MNT_DETACH
>-#define MNT_DETACH	0x00000002	/* Just detach from the tree */
>-#endif /* MNT_DETACH */
>-
>-/* sys/mount.h may be out too old to have these */
>-#ifndef MS_REC
>-#define MS_REC		16384
>-#endif
>-
>-#ifndef MS_SLAVE
>-#define MS_SLAVE	(1 << 19)
>-#endif
>-
>-#ifndef MS_SHARED
>-#define MS_SHARED	(1 << 20)
>-#endif
>-
>-#ifndef HAVE_SETNS
>-static int setns(int fd, int nstype)
>-{
>-#ifdef __NR_setns
>-	return syscall(__NR_setns, fd, nstype);
>-#else
>-	errno = ENOSYS;
>-	return -1;
>-#endif
>-}
>-#endif /* HAVE_SETNS */
>+#include "namespace.h"
> 
> static int usage(void)
> {
>@@ -101,42 +66,12 @@ static int netns_list(int argc, char **argv)
> 	return 0;
> }
> 
>-static void bind_etc(const char *name)
>-{
>-	char etc_netns_path[MAXPATHLEN];
>-	char netns_name[MAXPATHLEN];
>-	char etc_name[MAXPATHLEN];
>-	struct dirent *entry;
>-	DIR *dir;
>-
>-	snprintf(etc_netns_path, sizeof(etc_netns_path), "%s/%s", NETNS_ETC_DIR, name);
>-	dir = opendir(etc_netns_path);
>-	if (!dir)
>-		return;
>-
>-	while ((entry = readdir(dir)) != NULL) {
>-		if (strcmp(entry->d_name, ".") == 0)
>-			continue;
>-		if (strcmp(entry->d_name, "..") == 0)
>-			continue;
>-		snprintf(netns_name, sizeof(netns_name), "%s/%s", etc_netns_path, entry->d_name);
>-		snprintf(etc_name, sizeof(etc_name), "/etc/%s", entry->d_name);
>-		if (mount(netns_name, etc_name, "none", MS_BIND, NULL) < 0) {
>-			fprintf(stderr, "Bind %s -> %s failed: %s\n",
>-				netns_name, etc_name, strerror(errno));
>-		}
>-	}
>-	closedir(dir);
>-}
>-
> static int netns_exec(int argc, char **argv)
> {
> 	/* Setup the proper environment for apps that are not netns
> 	 * aware, and execute a program in that environment.
> 	 */
>-	const char *name, *cmd;
>-	char net_path[MAXPATHLEN];
>-	int netns;
>+	const char *cmd;
> 
> 	if (argc < 1) {
> 		fprintf(stderr, "No netns name specified\n");
>@@ -146,45 +81,10 @@ static int netns_exec(int argc, char **argv)
> 		fprintf(stderr, "No command specified\n");
> 		return -1;
> 	}
>-
>-	name = argv[0];
> 	cmd = argv[1];
>-	snprintf(net_path, sizeof(net_path), "%s/%s", NETNS_RUN_DIR, name);
>-	netns = open(net_path, O_RDONLY | O_CLOEXEC);
>-	if (netns < 0) {
>-		fprintf(stderr, "Cannot open network namespace \"%s\": %s\n",
>-			name, strerror(errno));
>-		return -1;
>-	}
>-
>-	if (setns(netns, CLONE_NEWNET) < 0) {
>-		fprintf(stderr, "setting the network namespace \"%s\" failed: %s\n",
>-			name, strerror(errno));
>-		return -1;
>-	}
> 
>-	if (unshare(CLONE_NEWNS) < 0) {
>-		fprintf(stderr, "unshare failed: %s\n", strerror(errno));
>-		return -1;
>-	}
>-	/* Don't let any mounts propagate back to the parent */
>-	if (mount("", "/", "none", MS_SLAVE | MS_REC, NULL)) {
>-		fprintf(stderr, "\"mount --make-rslave /\" failed: %s\n",
>-			strerror(errno));
>+	if (netns_switch(argv[0]))
> 		return -1;
>-	}
>-	/* Mount a version of /sys that describes the network namespace */
>-	if (umount2("/sys", MNT_DETACH) < 0) {
>-		fprintf(stderr, "umount of /sys failed: %s\n", strerror(errno));
>-		return -1;
>-	}
>-	if (mount(name, "/sys", "sysfs", 0, NULL) < 0) {
>-		fprintf(stderr, "mount of /sys failed: %s\n",strerror(errno));
>-		return -1;
>-	}
>-
>-	/* Setup bind mounts for config files in /etc */
>-	bind_etc(name);
> 
> 	fflush(stdout);
> 
>diff --git a/lib/Makefile b/lib/Makefile
>index a42b885..66f89f1 100644
>--- a/lib/Makefile
>+++ b/lib/Makefile
>@@ -1,8 +1,12 @@
> include ../Config
> 
>+ifeq ($(IP_CONFIG_SETNS),y)
>+	CFLAGS += -DHAVE_SETNS
>+endif
>+
> CFLAGS += -fPIC
> 
>-UTILOBJ=utils.o rt_names.o ll_types.o ll_proto.o ll_addr.o inet_proto.o
>+UTILOBJ=utils.o rt_names.o ll_types.o ll_proto.o ll_addr.o inet_proto.o namespace.o
> 
> NLOBJ=libgenl.o ll_map.o libnetlink.o
> 
>diff --git a/lib/namespace.c b/lib/namespace.c
>new file mode 100644
>index 0000000..1554ce0
>--- /dev/null
>+++ b/lib/namespace.c
>@@ -0,0 +1,86 @@
>+/*
>+ * namespace.c
>+ *
>+ *		This program is free software; you can redistribute it and/or
>+ *		modify it under the terms of the GNU General Public License
>+ *		as published by the Free Software Foundation; either version
>+ *		2 of the License, or (at your option) any later version.
>+ */
>+
>+#include <fcntl.h>
>+#include <dirent.h>
>+
>+#include "utils.h"
>+#include "namespace.h"
>+
>+static void bind_etc(const char *name)
>+{
>+	char etc_netns_path[MAXPATHLEN];
>+	char netns_name[MAXPATHLEN];
>+	char etc_name[MAXPATHLEN];
>+	struct dirent *entry;
>+	DIR *dir;
>+
>+	snprintf(etc_netns_path, sizeof(etc_netns_path), "%s/%s", NETNS_ETC_DIR, name);
>+	dir = opendir(etc_netns_path);
>+	if (!dir)
>+		return;
>+
>+	while ((entry = readdir(dir)) != NULL) {
>+		if (strcmp(entry->d_name, ".") == 0)
>+			continue;
>+		if (strcmp(entry->d_name, "..") == 0)
>+			continue;
>+		snprintf(netns_name, sizeof(netns_name), "%s/%s", etc_netns_path, entry->d_name);
>+		snprintf(etc_name, sizeof(etc_name), "/etc/%s", entry->d_name);
>+		if (mount(netns_name, etc_name, "none", MS_BIND, NULL) < 0) {
>+			fprintf(stderr, "Bind %s -> %s failed: %s\n",
>+				netns_name, etc_name, strerror(errno));
>+		}
>+	}
>+	closedir(dir);
>+}
>+
>+int netns_switch(char *name)
>+{
>+	char net_path[MAXPATHLEN];
>+	int netns;
>+
>+	snprintf(net_path, sizeof(net_path), "%s/%s", NETNS_RUN_DIR, name);
>+	netns = open(net_path, O_RDONLY | O_CLOEXEC);
>+	if (netns < 0) {
>+		fprintf(stderr, "Cannot open network namespace \"%s\": %s\n",
>+			name, strerror(errno));
>+		return -1;
>+	}
>+
>+	if (setns(netns, CLONE_NEWNET) < 0) {
>+		fprintf(stderr, "setting the network namespace \"%s\" failed: %s\n",
>+			name, strerror(errno));
>+		return -1;
>+	}
>+
>+	if (unshare(CLONE_NEWNS) < 0) {
>+		fprintf(stderr, "unshare failed: %s\n", strerror(errno));
>+		return -1;
>+	}
>+	/* Don't let any mounts propagate back to the parent */
>+	if (mount("", "/", "none", MS_SLAVE | MS_REC, NULL)) {
>+		fprintf(stderr, "\"mount --make-rslave /\" failed: %s\n",
>+			strerror(errno));
>+		return -1;
>+	}
>+	/* Mount a version of /sys that describes the network namespace */
>+	if (umount2("/sys", MNT_DETACH) < 0) {
>+		fprintf(stderr, "umount of /sys failed: %s\n", strerror(errno));
>+		return -1;
>+	}
>+	if (mount(name, "/sys", "sysfs", 0, NULL) < 0) {
>+		fprintf(stderr, "mount of /sys failed: %s\n",strerror(errno));
>+		return -1;
>+	}
>+
>+	/* Setup bind mounts for config files in /etc */
>+	bind_etc(name);
>+	return 0;
>+}
>diff --git a/man/man8/ip.8 b/man/man8/ip.8
>index 2d42e98..0fb759d 100644
>--- a/man/man8/ip.8
>+++ b/man/man8/ip.8
>@@ -31,7 +31,8 @@ ip \- show / manipulate routing, devices, policy routing and tunnels
> \fB\-r\fR[\fIesolve\fR] |
> \fB\-f\fR[\fIamily\fR] {
> .BR inet " | " inet6 " | " ipx " | " dnet " | " link " } | "
>-\fB\-o\fR[\fIneline\fR] }
>+\fB\-o\fR[\fIneline\fR] |
>+\fB\-n\fR[\fIetns\fR] }
> 
> 
> .SH OPTIONS
>@@ -134,6 +135,26 @@ the output.
> use the system's name resolver to print DNS names instead of
> host addresses.
> 
>+.TP
>+.BR "\-n" , " \-net" , " \-netns " <NETNS>
>+switches
>+.B ip
>+to the specified network namespace
>+.IR NETNS .
>+Actually it just simplifies executing of:
>+
>+.B ip netns exec
>+.IR NETNS
>+.B ip
>+.RI "[ " OPTIONS " ] " OBJECT " { " COMMAND " | "
>+.BR help " }"
>+
>+to
>+
>+.B ip
>+.RI "-n[etns] " NETNS " [ " OPTIONS " ] " OBJECT " { " COMMAND " | "
>+.BR help " }"
>+
> .SH IP - COMMAND SYNTAX
> 
> .SS
>-- 
>2.1.3

^ permalink raw reply

* Re: [RFC PATCH net-next 1/1] net: Support for switch port configuration
From: Roopa Prabhu @ 2014-12-13  7:06 UTC (permalink / raw)
  To: Varlese, Marco
  Cc: Jiri Pirko, John Fastabend, netdev@vger.kernel.org,
	stephen@networkplumber.org, Fastabend, John R, sfeldma@gmail.com,
	linux-kernel@vger.kernel.org
In-Reply-To: <C4896FB061E7DE4AAC93031BDCA044B104AC4609@IRSMSX108.ger.corp.intel.com>

On 12/12/14, 1:19 AM, Varlese, Marco wrote:
>> -----Original Message-----
>> From: Roopa Prabhu [mailto:roopa@cumulusnetworks.com]
>> Sent: Thursday, December 11, 2014 5:41 PM
>> To: Jiri Pirko
>> Cc: Varlese, Marco; John Fastabend; netdev@vger.kernel.org;
>> stephen@networkplumber.org; Fastabend, John R; sfeldma@gmail.com;
>> linux-kernel@vger.kernel.org
>> Subject: Re: [RFC PATCH net-next 1/1] net: Support for switch port
>> configuration
>>
>> On 12/11/14, 8:56 AM, Jiri Pirko wrote:
>>> Thu, Dec 11, 2014 at 05:37:46PM CET, roopa@cumulusnetworks.com wrote:
>>>> On 12/11/14, 3:01 AM, Jiri Pirko wrote:
>>>>> Thu, Dec 11, 2014 at 10:59:42AM CET, marco.varlese@intel.com wrote:
>>>>>>> -----Original Message-----
>>>>>>> From: John Fastabend [mailto:john.fastabend@gmail.com]
>>>>>>> Sent: Wednesday, December 10, 2014 5:04 PM
>>>>>>> To: Jiri Pirko
>>>>>>> Cc: Varlese, Marco; netdev@vger.kernel.org;
>>>>>>> stephen@networkplumber.org; Fastabend, John R;
>>>>>>> roopa@cumulusnetworks.com; sfeldma@gmail.com; linux-
>>>>>>> kernel@vger.kernel.org
>>>>>>> Subject: Re: [RFC PATCH net-next 1/1] net: Support for switch port
>>>>>>> configuration
>>>>>>>
>>>>>>> On 12/10/2014 08:50 AM, Jiri Pirko wrote:
>>>>>>>> Wed, Dec 10, 2014 at 05:23:40PM CET, marco.varlese@intel.com
>> wrote:
>>>>>>>>> From: Marco Varlese <marco.varlese@intel.com>
>>>>>>>>>
>>>>>>>>> Switch hardware offers a list of attributes that are
>>>>>>>>> configurable on a per port basis.
>>>>>>>>> This patch provides a mechanism to configure switch ports by
>>>>>>>>> adding an NDO for setting specific values to specific attributes.
>>>>>>>>> There will be a separate patch that extends iproute2 to call the
>>>>>>>>> new NDO.
>>>>>>>> What are these attributes? Can you give some examples. I'm asking
>>>>>>>> because there is a plan to pass generic attributes to switch
>>>>>>>> ports replacing current specific ndo_switch_port_stp_update. In
>>>>>>>> this case, bridge is setting that attribute.
>>>>>>>>
>>>>>>>> Is there need to set something directly from userspace or does it
>>>>>>>> make rather sense to use involved bridge/ovs/bond ? I think that
>>>>>>>> both will be needed.
>>>>>>> +1
>>>>>>>
>>>>>>> I think for many attributes it would be best to have both. The in
>>>>>>> kernel callers and netlink userspace can use the same driver ndo_ops.
>>>>>>>
>>>>>>> But then we don't _require_ any specific bridge/ovs/etc module.
>>>>>>> And we may have some attributes that are not specific to any
>>>>>>> existing software module. I'm guessing Marco has some examples of
>>>>>>> these.
>>>>>>> [...]
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> John Fastabend         Intel Corporation
>>>>>> We do have a need to configure the attributes directly from user-space
>>>>>> and I have identified the tool to do that in iproute2.
>>>>>> An example of attributes are:
>>>>>> * enabling/disabling of learning of source addresses on a given
>>>>>> port (you can imagine the attribute called LEARNING for example);
>>>>>> * internal loopback control (i.e. LOOPBACK) which will control how
>>>>>> the flow of traffic behaves from the switch fabric towards an
>>>>>> egress port;
>>>>>> * flooding for broadcast/multicast/unicast type of packets (i.e.
>>>>>> BFLOODING, MFLOODING, UFLOODING);
>>>>>>
>>>>>> Some attributes would be of the type enabled/disabled while others will
>>>>>> allow specific values to allow the user to configure different behaviours of
>>>>>> that feature on that particular port on that platform.
>>>>>> One thing to mention - as John stated as well - there might be some
>>>>>> attributes that are not specific to any software module but rather have to do
>>>>>> with the actual hardware/platform to configure.
>>>>>> I hope this clarifies some points.
>>>>> It does. Makes sense. We need to expose this attr set/get for both
>>>>> in-kernel and userspace use cases.
>>>>>
>>>>> Please adjust your patch for this. Also, as a second patch, it would
>>>>> be great if you can convert ndo_switch_port_stp_update to this new
>>>>> ndo.
>>>> Why are we exposing generic switch attribute get/set from userspace?
>>>> We already have specific attributes for learning/flooding which
>>>> can be extended further.
>>> Yes, but that is for PF_BRIDGE and bridge specific attributes. There
>>> might be another generic attrs, no?
>> I can't think of any. Plus, the whole point of switchdev L2 offloads was to
>> map these to existing bridge attributes. And we already have a match for
>> some of the attributes that Marco wants.
>>
>> If there is a need for such attributes, I don't see why it is needed for switch
>> devices only.
>> It is needed for any hw (NICs etc.). And a precedent for this is to do it via
>> ethtool.
>>
>> Having said that, I am sure we will find a need for this in the future.
>> And having a netlink attribute always helps.
>>
>> Today, it seems like these can be mapped to existing attributes that are
>> settable via ndo_bridge_setlink/getlink.
>>
>>>> And for the in-kernel API, I had a sample patch in my RFC series (which
>>>> I was going to resubmit, until it was decided that we will use the
>>>> existing API around
>>>> ndo_bridge_setlink/ndo_bridge_getlink):
>>>> http://www.spinics.net/lists/netdev/msg305473.html
>>> Yes, this might become handy for other generic non-bridge attrs.
>>>
>>>> Thanks,
>>>> Roopa
>>>>
>>>>
>>>>
> The list I provided is only a subset of the attributes we will need to expose. I have more, and I'm sure that more will come in the future. As I mentioned a few posts earlier, some attributes are generic and some are not.
>
> I did not consider ethtool for a few reasons, but the main one is that I was under the impression that netlink was preferred over ethtool_ops in many circumstances.
That is correct. I don't think anybody hinted that you should extend 
ethtool.
>   Plus, all the cases I have identified so far are going to nicely fit into the setlink set of operations.
>

It would be better if you submitted your iproute2 patch together with this patch.

I do plan to resubmit my generic ndo patch soon.

Thanks,
Roopa

* Re: [PATCH v2 0/6] net-PPP: Deletion of a few unnecessary checks
From: SF Markus Elfring @ 2014-12-13  6:17 UTC (permalink / raw)
  To: David Miller
  Cc: Sergei Shtylyov, Paul Mackerras, linux-ppp, netdev, Eric Dumazet,
	linux-kernel, kernel-janitors, Julia Lawall
In-Reply-To: <20141212.150741.2169710971698369167.davem@davemloft.net>

> I'd like to honestly ask why you are being so difficult?

There are several factors which contribute to your perception of
difficulty here.

1. I try to extract from each piece of feedback how far a specific
update suggestion has been accepted or rejected.
   Terse feedback (like yours on this issue) occasionally makes it
harder to see the next useful steps. So a further constructive discussion
develops around the clarification of some implementation details.

2. I also prefer different communication styles at some points.

3. I reached a point where the desired software updates were not
immediately obvious to me, while other contributors might already have
achieved a better understanding of the affected issues.

4. I am currently working on getting my Linux software development
system running again.

https://forums.opensuse.org/showthread.php/503327-System-startup-does-not-continue-after-hard-disk-detection


> Everyone gets their code reviewed, everyone has to modify their
> changes to adhere to the subsystem maintainer's wishes.

That is fine as usual.


> You are not being treated specially, and quite frankly nobody
> is asking anything unreasonable of you.

That is also true, as the software development process will continue.

Regards,
Markus
