* Re: Multicast packets being lost (3.10 stable)
From: David Miller @ 2014-12-13 20:37 UTC (permalink / raw)
To: linus.luessing; +Cc: openwrt-devel, netdev, bridge, gregkh, shemming
In-Reply-To: <20141210191633.GA2473@odroid>
From: Linus Lüssing <linus.luessing@c0d3.blue>
Date: Wed, 10 Dec 2014 20:16:33 +0100
> did you have a chance to look into backporting these fixes for
> stable yet?
I am no longer submitting -stable fixes back to 3.10. I do at most
four -stable releases, and right now those are 3.18, 3.17,
3.14, and 3.12.
^ permalink raw reply
* Re: 3.12.33 - BUG xfrm_selector_match+0x25/0x2f6
From: Julian Anastasov @ 2014-12-13 20:19 UTC (permalink / raw)
To: Smart Weblications GmbH - Florian Wiessner
Cc: Steffen Klassert, netdev, LKML, stable, Simon Horman, lvs-devel
In-Reply-To: <5489A465.6090305@smart-weblications.de>
Hello,
On Thu, 11 Dec 2014, Smart Weblications GmbH - Florian Wiessner wrote:
> >> [ 512.485323] CPU: 4 PID: 28142 Comm: vsftpd Not tainted 3.12.33 #5
> >
> > Above "#5" is same as previous oops. It means kernel
> > is not updated. Or you updated only the IPVS modules after
> > applying the both patches?
>
> I did it with make-kpkg --initrd linux_image which only rebuilt the modules,
> correct. I can retry with make clean before building the package
I just tested PASV and PORT with 3.12.33 including
both patches (seq adj fix + ip_route_me_harder fix) and do not
see any crashes in nf_ct_seqadj_set. If you still have problems
with FTP, send me more info off-list.
> > You can also try without FTP tests to see if there
> > are oopses in xfrm, so that we can close this topic and then
> > to continue for the FTP problem on IPVS lists without
> > bothering non-IPVS people.
> >
>
> yeah, it seems that the xfrm issue is away.
Thanks for the confirmation!
Regards
--
Julian Anastasov <ja@ssi.bg>
^ permalink raw reply
* [PATCH iproute2 4/4] tc: Allow to easy change network namespace
From: Vadim Kochan @ 2014-12-13 17:55 UTC (permalink / raw)
To: netdev; +Cc: Vadim Kochan
In-Reply-To: <1418493334-23142-1-git-send-email-vadim4j@gmail.com>
From: Vadim Kochan <vadim4j@gmail.com>
Add a new '-netns' option to simplify the following command:
ip netns exec NETNS tc OPTIONS OBJECT COMMAND
to
tc -n[etns] NETNS OPTIONS OBJECT COMMAND
e.g.:
tc -net vnet0 qdisc
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
---
man/man8/tc.8 | 65 +++++++++++++++++++++++++++++++++++++++++++++--------------
tc/Makefile | 5 +++++
tc/tc.c | 8 +++++++-
3 files changed, 62 insertions(+), 16 deletions(-)
diff --git a/man/man8/tc.8 b/man/man8/tc.8
index 8d794de..d8f974f 100644
--- a/man/man8/tc.8
+++ b/man/man8/tc.8
@@ -2,7 +2,9 @@
.SH NAME
tc \- show / manipulate traffic control settings
.SH SYNOPSIS
-.B tc qdisc [ add | change | replace | link | delete ] dev
+.B tc
+.RI "[ " OPTIONS " ]"
+.B qdisc [ add | change | replace | link | delete ] dev
DEV
.B
[ parent
@@ -13,7 +15,9 @@ qdisc-id ] qdisc
[ qdisc specific parameters ]
.P
-.B tc class [ add | change | replace | delete ] dev
+.B tc
+.RI "[ " OPTIONS " ]"
+.B class [ add | change | replace | delete ] dev
DEV
.B parent
qdisc-id
@@ -22,7 +26,9 @@ class-id ] qdisc
[ qdisc specific parameters ]
.P
-.B tc filter [ add | change | replace | delete ] dev
+.B tc
+.RI "[ " OPTIONS " ]"
+.B filter [ add | change | replace | delete ] dev
DEV
.B [ parent
qdisc-id
@@ -35,21 +41,28 @@ priority filtertype
flow-id
.B tc
+.RI "[ " OPTIONS " ]"
.RI "[ " FORMAT " ]"
.B qdisc show [ dev
DEV
.B ]
.P
.B tc
+.RI "[ " OPTIONS " ]"
.RI "[ " FORMAT " ]"
.B class show dev
DEV
.P
-.B tc filter show dev
+.B tc
+.RI "[ " OPTIONS " ]"
+.B filter show dev
DEV
.P
-.B tc [ -force ] -b\fR[\fIatch\fR] \fB[ filename ]
+.ti 8
+.IR OPTIONS " := {"
+\fB[ -force ] -b\fR[\fIatch\fR] \fB[ filename ] \fR|
+\fB[ \fB-n\fR[\fIetns\fR] name \fB] \fR}
.ti 8
.IR FORMAT " := {"
@@ -407,6 +420,38 @@ link
Only available for qdiscs and performs a replace where the node
must exist already.
+.SH OPTIONS
+
+.TP
+.BR "\-b", " \-b filename", " \-batch", " \-batch filename"
+read commands from provided file or standard input and invoke them.
+First failure will cause termination of tc.
+
+.TP
+.BR "\-force"
+don't terminate tc on errors in batch mode.
+If there were any errors during execution of the commands, the application return code will be non zero.
+
+.TP
+.BR "\-n" , " \-net" , " \-netns " <NETNS>
+switches
+.B tc
+to the specified network namespace
+.IR NETNS .
+Actually it just simplifies executing of:
+
+.B ip netns exec
+.IR NETNS
+.B tc
+.RI "[ " OPTIONS " ] " OBJECT " { " COMMAND " | "
+.BR help " }"
+
+to
+
+.B tc
+.RI "-n[etns] " NETNS " [ " OPTIONS " ] " OBJECT " { " COMMAND " | "
+.BR help " }"
+
.SH FORMAT
The show command has additional formatting options:
@@ -430,16 +475,6 @@ decode filter offset and mask values to equivalent filter commands based on TCP/
.BR "\-iec"
print rates in IEC units (ie. 1K = 1024).
-.TP
-.BR "\-b", " \-b filename", " \-batch", " \-batch filename"
-read commands from provided file or standard input and invoke them.
-First failure will cause termination of tc.
-
-.TP
-.BR "\-force"
-don't terminate tc on errors in batch mode.
-If there were any errors during execution of the commands, the application return code will be non zero.
-
.SH HISTORY
.B tc
was written by Alexey N. Kuznetsov and added in Linux 2.2.
diff --git a/tc/Makefile b/tc/Makefile
index 1ab36c6..536ed88 100644
--- a/tc/Makefile
+++ b/tc/Makefile
@@ -3,6 +3,11 @@ TCOBJ= tc.o tc_qdisc.o tc_class.o tc_filter.o tc_util.o \
m_ematch.o emp_ematch.yacc.o emp_ematch.lex.o
include ../Config
+
+ifeq ($(IP_CONFIG_SETNS),y)
+ CFLAGS += -DHAVE_SETNS
+endif
+
SHARED_LIBS ?= y
TCMODULES :=
diff --git a/tc/tc.c b/tc/tc.c
index 9b50e74..ea4ba10 100644
--- a/tc/tc.c
+++ b/tc/tc.c
@@ -29,6 +29,7 @@
#include "utils.h"
#include "tc_util.h"
#include "tc_common.h"
+#include "namespace.h"
int show_stats = 0;
int show_details = 0;
@@ -185,7 +186,8 @@ static void usage(void)
fprintf(stderr, "Usage: tc [ OPTIONS ] OBJECT { COMMAND | help }\n"
" tc [-force] -batch filename\n"
"where OBJECT := { qdisc | class | filter | action | monitor }\n"
- " OPTIONS := { -s[tatistics] | -d[etails] | -r[aw] | -p[retty] | -b[atch] [filename] }\n");
+ " OPTIONS := { -s[tatistics] | -d[etails] | -r[aw] | -p[retty] | -b[atch] [filename] | "
+ "-n[etns] name }\n");
}
static int do_cmd(int argc, char **argv)
@@ -293,6 +295,10 @@ int main(int argc, char **argv)
if (argc <= 1)
usage();
batch_file = argv[1];
+ } else if (matches(argv[1], "-netns") == 0) {
+ NEXT_ARG();
+ if (netns_switch(argv[1]))
+ return -1;
} else {
fprintf(stderr, "Option \"%s\" is unknown, try \"tc -help\".\n", argv[1]);
return -1;
--
2.1.3
^ permalink raw reply related
* [PATCH iproute2 3/4] bridge: Allow to easy change network namespace
From: Vadim Kochan @ 2014-12-13 17:55 UTC (permalink / raw)
To: netdev; +Cc: Vadim Kochan
In-Reply-To: <1418493334-23142-1-git-send-email-vadim4j@gmail.com>
From: Vadim Kochan <vadim4j@gmail.com>
Add a new '-netns' option to simplify the following command:
ip netns exec NETNS bridge OPTIONS OBJECT COMMAND
to
bridge -n[etns] NETNS OPTIONS OBJECT COMMAND
e.g.:
bridge -net vnet0 fdb
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
---
bridge/Makefile | 4 ++++
bridge/bridge.c | 7 ++++++-
man/man8/bridge.8 | 23 ++++++++++++++++++++++-
3 files changed, 32 insertions(+), 2 deletions(-)
diff --git a/bridge/Makefile b/bridge/Makefile
index 1fb8320..9800753 100644
--- a/bridge/Makefile
+++ b/bridge/Makefile
@@ -2,6 +2,10 @@ BROBJ = bridge.o fdb.o monitor.o link.o mdb.o vlan.o
include ../Config
+ifeq ($(IP_CONFIG_SETNS),y)
+ CFLAGS += -DHAVE_SETNS
+endif
+
all: bridge
bridge: $(BROBJ) $(LIBNETLINK)
diff --git a/bridge/bridge.c b/bridge/bridge.c
index ee08f90..5fcc552 100644
--- a/bridge/bridge.c
+++ b/bridge/bridge.c
@@ -13,6 +13,7 @@
#include "SNAPSHOT.h"
#include "utils.h"
#include "br_common.h"
+#include "namespace.h"
struct rtnl_handle rth = { .fd = -1 };
int preferred_family = AF_UNSPEC;
@@ -31,7 +32,7 @@ static void usage(void)
"Usage: bridge [ OPTIONS ] OBJECT { COMMAND | help }\n"
"where OBJECT := { link | fdb | mdb | vlan | monitor }\n"
" OPTIONS := { -V[ersion] | -s[tatistics] | -d[etails] |\n"
-" -o[neline] | -t[imestamp] \n");
+" -o[neline] | -t[imestamp] | -n[etns] name }\n");
exit(-1);
}
@@ -112,6 +113,10 @@ main(int argc, char **argv)
preferred_family = AF_INET;
} else if (strcmp(opt, "-6") == 0) {
preferred_family = AF_INET6;
+ } else if (matches(opt, "-netns") == 0) {
+ NEXT_ARG();
+ if (netns_switch(argv[1]))
+ exit(-1);
} else {
fprintf(stderr, "Option \"%s\" is unknown, try \"bridge help\".\n", opt);
exit(-1);
diff --git a/man/man8/bridge.8 b/man/man8/bridge.8
index af31d41..cb3fb46 100644
--- a/man/man8/bridge.8
+++ b/man/man8/bridge.8
@@ -19,7 +19,8 @@ bridge \- show / manipulate bridge addresses and devices
.ti -8
.IR OPTIONS " := { "
\fB\-V\fR[\fIersion\fR] |
-\fB\-s\fR[\fItatistics\fR] }
+\fB\-s\fR[\fItatistics\fR] |
+\fB\-n\fR[\fIetns\fR] name }
.ti -8
.BR "bridge link set"
@@ -112,6 +113,26 @@ output more information. If this option
is given multiple times, the amount of information increases.
As a rule, the information is statistics or some time values.
+.TP
+.BR "\-n" , " \-net" , " \-netns " <NETNS>
+switches
+.B bridge
+to the specified network namespace
+.IR NETNS .
+Actually it just simplifies executing of:
+
+.B ip netns exec
+.IR NETNS
+.B bridge
+.RI "[ " OPTIONS " ] " OBJECT " { " COMMAND " | "
+.BR help " }"
+
+to
+
+.B bridge
+.RI "-n[etns] " NETNS " [ " OPTIONS " ] " OBJECT " { " COMMAND " | "
+.BR help " }"
+
.SH BRIDGE - COMMAND SYNTAX
--
2.1.3
^ permalink raw reply related
* [PATCH iproute2 2/4] ip: Allow to easy change network namespace
From: Vadim Kochan @ 2014-12-13 17:55 UTC (permalink / raw)
To: netdev; +Cc: Vadim Kochan
In-Reply-To: <1418493334-23142-1-git-send-email-vadim4j@gmail.com>
From: Vadim Kochan <vadim4j@gmail.com>
Add a new '-netns' option to simplify the following command:
ip netns exec NETNS ip OPTIONS OBJECT COMMAND
to
ip -n[etns] NETNS OPTIONS OBJECT COMMAND
e.g.:
ip -net vnet0 link add br0 type bridge
ip -n vnet0 link
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
---
ip/ip.c | 7 ++++++-
man/man8/ip.8 | 23 ++++++++++++++++++++++-
2 files changed, 28 insertions(+), 2 deletions(-)
diff --git a/ip/ip.c b/ip/ip.c
index 5f759d5..96e64a3 100644
--- a/ip/ip.c
+++ b/ip/ip.c
@@ -22,6 +22,7 @@
#include "SNAPSHOT.h"
#include "utils.h"
#include "ip_common.h"
+#include "namespace.h"
int preferred_family = AF_UNSPEC;
int human_readable = 0;
@@ -54,7 +55,7 @@ static void usage(void)
" -4 | -6 | -I | -D | -B | -0 |\n"
" -l[oops] { maximum-addr-flush-attempts } |\n"
" -o[neline] | -t[imestamp] | -b[atch] [filename] |\n"
-" -rc[vbuf] [size]}\n");
+" -rc[vbuf] [size] | -n[etns] name }\n");
exit(-1);
}
@@ -262,6 +263,10 @@ int main(int argc, char **argv)
rcvbuf = size;
} else if (matches(opt, "-help") == 0) {
usage();
+ } else if (matches(opt, "-netns") == 0) {
+ NEXT_ARG();
+ if (netns_switch(argv[1]))
+ exit(-1);
} else {
fprintf(stderr, "Option \"%s\" is unknown, try \"ip -help\".\n", opt);
exit(-1);
diff --git a/man/man8/ip.8 b/man/man8/ip.8
index 2d42e98..0bae59e 100644
--- a/man/man8/ip.8
+++ b/man/man8/ip.8
@@ -31,7 +31,8 @@ ip \- show / manipulate routing, devices, policy routing and tunnels
\fB\-r\fR[\fIesolve\fR] |
\fB\-f\fR[\fIamily\fR] {
.BR inet " | " inet6 " | " ipx " | " dnet " | " link " } | "
-\fB\-o\fR[\fIneline\fR] }
+\fB\-o\fR[\fIneline\fR] |
+\fB\-n\fR[\fIetns\fR] name }
.SH OPTIONS
@@ -134,6 +135,26 @@ the output.
use the system's name resolver to print DNS names instead of
host addresses.
+.TP
+.BR "\-n" , " \-net" , " \-netns " <NETNS>
+switches
+.B ip
+to the specified network namespace
+.IR NETNS .
+Actually it just simplifies executing of:
+
+.B ip netns exec
+.IR NETNS
+.B ip
+.RI "[ " OPTIONS " ] " OBJECT " { " COMMAND " | "
+.BR help " }"
+
+to
+
+.B ip
+.RI "-n[etns] " NETNS " [ " OPTIONS " ] " OBJECT " { " COMMAND " | "
+.BR help " }"
+
.SH IP - COMMAND SYNTAX
.SS
--
2.1.3
^ permalink raw reply related
* [PATCH iproute2 1/4] lib: Add netns_switch func for change network namespace
From: Vadim Kochan @ 2014-12-13 17:55 UTC (permalink / raw)
To: netdev; +Cc: Vadim Kochan
In-Reply-To: <1418493334-23142-1-git-send-email-vadim4j@gmail.com>
From: Vadim Kochan <vadim4j@gmail.com>
Move the namespace-switching code from ip/ipnetns.c into a new
netns_switch() function in lib/namespace.c so that other tools can
use it to switch network namespaces quickly.
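As a rough illustration of the first step netns_switch() performs, the sketch below builds the path of a named namespace under /var/run/netns. It is not part of the patch: netns_build_path() is a hypothetical helper, and the NULL/empty-name and truncation checks are additions that the moved code does not have.

```c
#include <stdio.h>
#include <stddef.h>

#define NETNS_RUN_DIR "/var/run/netns"

/* Hypothetical helper (not in the patch): write "<NETNS_RUN_DIR>/<name>"
 * into buf. Returns 0 on success, or -1 if name is NULL or empty or the
 * resulting path would not fit in buf. */
static int netns_build_path(char *buf, size_t size, const char *name)
{
	int n;

	if (name == NULL || name[0] == '\0')
		return -1;

	n = snprintf(buf, size, "%s/%s", NETNS_RUN_DIR, name);
	if (n < 0 || (size_t)n >= size)
		return -1;	/* output error or truncated path */

	return 0;
}
```

A caller would then open() the resulting path and hand the descriptor to setns(fd, CLONE_NEWNET), as netns_switch() does in lib/namespace.c.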
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
---
include/namespace.h | 46 +++++++++++++++++++++++
ip/ipnetns.c | 106 ++--------------------------------------------------
lib/Makefile | 6 ++-
lib/namespace.c | 86 ++++++++++++++++++++++++++++++++++++++++++
4 files changed, 140 insertions(+), 104 deletions(-)
create mode 100644 include/namespace.h
create mode 100644 lib/namespace.c
diff --git a/include/namespace.h b/include/namespace.h
new file mode 100644
index 0000000..2f13e65
--- /dev/null
+++ b/include/namespace.h
@@ -0,0 +1,46 @@
+#ifndef __NAMESPACE_H__
+#define __NAMESPACE_H__ 1
+
+#include <sched.h>
+#include <sys/mount.h>
+#include <errno.h>
+
+#define NETNS_RUN_DIR "/var/run/netns"
+#define NETNS_ETC_DIR "/etc/netns"
+
+#ifndef CLONE_NEWNET
+#define CLONE_NEWNET 0x40000000 /* New network namespace (lo, device, names sockets, etc) */
+#endif
+
+#ifndef MNT_DETACH
+#define MNT_DETACH 0x00000002 /* Just detach from the tree */
+#endif /* MNT_DETACH */
+
+/* sys/mount.h may be out too old to have these */
+#ifndef MS_REC
+#define MS_REC 16384
+#endif
+
+#ifndef MS_SLAVE
+#define MS_SLAVE (1 << 19)
+#endif
+
+#ifndef MS_SHARED
+#define MS_SHARED (1 << 20)
+#endif
+
+#ifndef HAVE_SETNS
+static int setns(int fd, int nstype)
+{
+#ifdef __NR_setns
+ return syscall(__NR_setns, fd, nstype);
+#else
+ errno = ENOSYS;
+ return -1;
+#endif
+}
+#endif /* HAVE_SETNS */
+
+extern int netns_switch(char *netns);
+
+#endif /* __NAMESPACE_H__ */
diff --git a/ip/ipnetns.c b/ip/ipnetns.c
index 1c8aa02..519d518 100644
--- a/ip/ipnetns.c
+++ b/ip/ipnetns.c
@@ -17,42 +17,7 @@
#include "utils.h"
#include "ip_common.h"
-
-#define NETNS_RUN_DIR "/var/run/netns"
-#define NETNS_ETC_DIR "/etc/netns"
-
-#ifndef CLONE_NEWNET
-#define CLONE_NEWNET 0x40000000 /* New network namespace (lo, device, names sockets, etc) */
-#endif
-
-#ifndef MNT_DETACH
-#define MNT_DETACH 0x00000002 /* Just detach from the tree */
-#endif /* MNT_DETACH */
-
-/* sys/mount.h may be out too old to have these */
-#ifndef MS_REC
-#define MS_REC 16384
-#endif
-
-#ifndef MS_SLAVE
-#define MS_SLAVE (1 << 19)
-#endif
-
-#ifndef MS_SHARED
-#define MS_SHARED (1 << 20)
-#endif
-
-#ifndef HAVE_SETNS
-static int setns(int fd, int nstype)
-{
-#ifdef __NR_setns
- return syscall(__NR_setns, fd, nstype);
-#else
- errno = ENOSYS;
- return -1;
-#endif
-}
-#endif /* HAVE_SETNS */
+#include "namespace.h"
static int usage(void)
{
@@ -101,42 +66,12 @@ static int netns_list(int argc, char **argv)
return 0;
}
-static void bind_etc(const char *name)
-{
- char etc_netns_path[MAXPATHLEN];
- char netns_name[MAXPATHLEN];
- char etc_name[MAXPATHLEN];
- struct dirent *entry;
- DIR *dir;
-
- snprintf(etc_netns_path, sizeof(etc_netns_path), "%s/%s", NETNS_ETC_DIR, name);
- dir = opendir(etc_netns_path);
- if (!dir)
- return;
-
- while ((entry = readdir(dir)) != NULL) {
- if (strcmp(entry->d_name, ".") == 0)
- continue;
- if (strcmp(entry->d_name, "..") == 0)
- continue;
- snprintf(netns_name, sizeof(netns_name), "%s/%s", etc_netns_path, entry->d_name);
- snprintf(etc_name, sizeof(etc_name), "/etc/%s", entry->d_name);
- if (mount(netns_name, etc_name, "none", MS_BIND, NULL) < 0) {
- fprintf(stderr, "Bind %s -> %s failed: %s\n",
- netns_name, etc_name, strerror(errno));
- }
- }
- closedir(dir);
-}
-
static int netns_exec(int argc, char **argv)
{
/* Setup the proper environment for apps that are not netns
* aware, and execute a program in that environment.
*/
- const char *name, *cmd;
- char net_path[MAXPATHLEN];
- int netns;
+ const char *cmd;
if (argc < 1) {
fprintf(stderr, "No netns name specified\n");
@@ -146,45 +81,10 @@ static int netns_exec(int argc, char **argv)
fprintf(stderr, "No command specified\n");
return -1;
}
-
- name = argv[0];
cmd = argv[1];
- snprintf(net_path, sizeof(net_path), "%s/%s", NETNS_RUN_DIR, name);
- netns = open(net_path, O_RDONLY | O_CLOEXEC);
- if (netns < 0) {
- fprintf(stderr, "Cannot open network namespace \"%s\": %s\n",
- name, strerror(errno));
- return -1;
- }
-
- if (setns(netns, CLONE_NEWNET) < 0) {
- fprintf(stderr, "setting the network namespace \"%s\" failed: %s\n",
- name, strerror(errno));
- return -1;
- }
- if (unshare(CLONE_NEWNS) < 0) {
- fprintf(stderr, "unshare failed: %s\n", strerror(errno));
- return -1;
- }
- /* Don't let any mounts propagate back to the parent */
- if (mount("", "/", "none", MS_SLAVE | MS_REC, NULL)) {
- fprintf(stderr, "\"mount --make-rslave /\" failed: %s\n",
- strerror(errno));
+ if (netns_switch(argv[0]))
return -1;
- }
- /* Mount a version of /sys that describes the network namespace */
- if (umount2("/sys", MNT_DETACH) < 0) {
- fprintf(stderr, "umount of /sys failed: %s\n", strerror(errno));
- return -1;
- }
- if (mount(name, "/sys", "sysfs", 0, NULL) < 0) {
- fprintf(stderr, "mount of /sys failed: %s\n",strerror(errno));
- return -1;
- }
-
- /* Setup bind mounts for config files in /etc */
- bind_etc(name);
fflush(stdout);
diff --git a/lib/Makefile b/lib/Makefile
index a42b885..66f89f1 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -1,8 +1,12 @@
include ../Config
+ifeq ($(IP_CONFIG_SETNS),y)
+ CFLAGS += -DHAVE_SETNS
+endif
+
CFLAGS += -fPIC
-UTILOBJ=utils.o rt_names.o ll_types.o ll_proto.o ll_addr.o inet_proto.o
+UTILOBJ=utils.o rt_names.o ll_types.o ll_proto.o ll_addr.o inet_proto.o namespace.o
NLOBJ=libgenl.o ll_map.o libnetlink.o
diff --git a/lib/namespace.c b/lib/namespace.c
new file mode 100644
index 0000000..1554ce0
--- /dev/null
+++ b/lib/namespace.c
@@ -0,0 +1,86 @@
+/*
+ * namespace.c
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include <fcntl.h>
+#include <dirent.h>
+
+#include "utils.h"
+#include "namespace.h"
+
+static void bind_etc(const char *name)
+{
+ char etc_netns_path[MAXPATHLEN];
+ char netns_name[MAXPATHLEN];
+ char etc_name[MAXPATHLEN];
+ struct dirent *entry;
+ DIR *dir;
+
+ snprintf(etc_netns_path, sizeof(etc_netns_path), "%s/%s", NETNS_ETC_DIR, name);
+ dir = opendir(etc_netns_path);
+ if (!dir)
+ return;
+
+ while ((entry = readdir(dir)) != NULL) {
+ if (strcmp(entry->d_name, ".") == 0)
+ continue;
+ if (strcmp(entry->d_name, "..") == 0)
+ continue;
+ snprintf(netns_name, sizeof(netns_name), "%s/%s", etc_netns_path, entry->d_name);
+ snprintf(etc_name, sizeof(etc_name), "/etc/%s", entry->d_name);
+ if (mount(netns_name, etc_name, "none", MS_BIND, NULL) < 0) {
+ fprintf(stderr, "Bind %s -> %s failed: %s\n",
+ netns_name, etc_name, strerror(errno));
+ }
+ }
+ closedir(dir);
+}
+
+int netns_switch(char *name)
+{
+ char net_path[MAXPATHLEN];
+ int netns;
+
+ snprintf(net_path, sizeof(net_path), "%s/%s", NETNS_RUN_DIR, name);
+ netns = open(net_path, O_RDONLY | O_CLOEXEC);
+ if (netns < 0) {
+ fprintf(stderr, "Cannot open network namespace \"%s\": %s\n",
+ name, strerror(errno));
+ return -1;
+ }
+
+ if (setns(netns, CLONE_NEWNET) < 0) {
+ fprintf(stderr, "setting the network namespace \"%s\" failed: %s\n",
+ name, strerror(errno));
+ return -1;
+ }
+
+ if (unshare(CLONE_NEWNS) < 0) {
+ fprintf(stderr, "unshare failed: %s\n", strerror(errno));
+ return -1;
+ }
+ /* Don't let any mounts propagate back to the parent */
+ if (mount("", "/", "none", MS_SLAVE | MS_REC, NULL)) {
+ fprintf(stderr, "\"mount --make-rslave /\" failed: %s\n",
+ strerror(errno));
+ return -1;
+ }
+ /* Mount a version of /sys that describes the network namespace */
+ if (umount2("/sys", MNT_DETACH) < 0) {
+ fprintf(stderr, "umount of /sys failed: %s\n", strerror(errno));
+ return -1;
+ }
+ if (mount(name, "/sys", "sysfs", 0, NULL) < 0) {
+ fprintf(stderr, "mount of /sys failed: %s\n",strerror(errno));
+ return -1;
+ }
+
+ /* Setup bind mounts for config files in /etc */
+ bind_etc(name);
+ return 0;
+}
--
2.1.3
^ permalink raw reply related
* [PATCH iproute2 0/4] Switch network ns w/o execvp for iproute2 tools
From: Vadim Kochan @ 2014-12-13 17:55 UTC (permalink / raw)
To: netdev; +Cc: Vadim Kochan
This series adds a new -n[etns] option to the ip, tc and bridge tools, which
makes it easier and faster to switch to a specified network namespace. So instead of:
ip netns exec NETNS { ip | tc | bridge } OBJECT COMMAND
it will be possible to do the same with:
{ ip | tc | bridge } -n[etns] NETNS OBJECT COMMAND
I skipped the misc tools and will work on them later.
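For illustration only, the way an abbreviated option such as -n, -net or -netns could be recognised and its value picked up is sketched below. This is a standalone sketch, not the code from the series: is_option_prefix() and find_netns_arg() are hypothetical names, and the matching rule (an argument matches when it is a leading prefix of the full option name) is an assumption modelled on the matches(opt, "-netns") calls in the patches.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 when arg is a leading prefix of full and
 * is at least two characters long, e.g. "-n" or "-net" for "-netns". */
static int is_option_prefix(const char *arg, const char *full)
{
	size_t len = strlen(arg);

	if (len < 2 || len > strlen(full))
		return 0;
	return strncmp(arg, full, len) == 0;
}

/* Hypothetical helper: scan the leading options in argv and return the
 * value following the first -n[etns] option, or NULL when the option is
 * absent or has no value. Assumption for this sketch: no other option
 * takes a value, so scanning stops at the first word without a '-'. */
static const char *find_netns_arg(int argc, char **argv)
{
	int i;

	for (i = 1; i < argc; i++) {
		if (argv[i][0] != '-')
			break;
		if (is_option_prefix(argv[i], "-netns"))
			return (i + 1 < argc) ? argv[i + 1] : NULL;
	}
	return NULL;
}
```

With these assumptions, "ip -net vnet0 link" and "ip -n vnet0 link" both yield "vnet0", while a trailing "-n" with no name yields NULL, so the caller can report an incomplete command.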
Vadim Kochan (4):
lib: Add netns_switch func for change network namespace
ip: Allow to easy change network namespace
bridge: Allow to easy change network namespace
tc: Allow to easy change network namespace
bridge/Makefile | 4 ++
bridge/bridge.c | 7 +++-
include/namespace.h | 46 +++++++++++++++++++++++
ip/ip.c | 7 +++-
ip/ipnetns.c | 106 ++--------------------------------------------------
lib/Makefile | 6 ++-
lib/namespace.c | 86 ++++++++++++++++++++++++++++++++++++++++++
man/man8/bridge.8 | 23 +++++++++++-
man/man8/ip.8 | 23 +++++++++++-
man/man8/tc.8 | 65 ++++++++++++++++++++++++--------
tc/Makefile | 5 +++
tc/tc.c | 8 +++-
12 files changed, 262 insertions(+), 124 deletions(-)
create mode 100644 include/namespace.h
create mode 100644 lib/namespace.c
--
2.1.3
^ permalink raw reply
* Re: [PATCH iproute2 v4] ip: Simplify executing ip cmd within network ns
From: Jiri Pirko @ 2014-12-13 15:20 UTC (permalink / raw)
To: vadim4j; +Cc: netdev
In-Reply-To: <20141213133210.GA12291@angus-think.lan>
Sat, Dec 13, 2014 at 02:32:10PM CET, vadim4j@gmail.com wrote:
>On Sat, Dec 13, 2014 at 10:58:03AM +0200, vadim4j@gmail.com wrote:
>> On Sat, Dec 13, 2014 at 10:42:43AM +0200, vadim4j@gmail.com wrote:
>> > On Sat, Dec 13, 2014 at 09:29:36AM +0100, Jiri Pirko wrote:
>> > > Fri, Dec 12, 2014 at 11:15:07PM CET, vadim4j@gmail.com wrote:
>> > > >From: Vadim Kochan <vadim4j@gmail.com>
>> > > >
>> > > >Added new '-netns' option to simplify executing following cmd:
>> > > >
>> > > > ip netns exec NETNS ip OPTIONS COMMAND OBJECT
>> > > >
>> > > > to
>> > > >
>> > > > ip -n[etns] NETNS OPTIONS COMMAND OBJECT
>> > > >
>> > > >e.g.:
>> > > >
>> > > > ip -net vnet0 link add br0 type bridge
>> > > > ip -n vnet0 link
>> > > >
>> > > >Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
>> > >
>> > >
>> > > This looks good. I'm still missing support in tc, bridge, etc. I think
>> > > it would be great to do this in the same patch/patchset.
>> > >
>> > I planned to do this in the future patches after this main
>> > changes will be accepted. Actually adding this option to other
>> > tools is trivial.
>> >
>> > Anyway may be I will re-send v5 with supporting of these tools if I will have time.
>> >
>> > Regards,
>>
>> BTW, some tools already have '-n' option, so I think only '-net' can be
>> used in such cases.
Yep, that is my point. I would like to have the same option for all.
>>
>> Regards,
>
>OK, I am going to split changes into series of patches and bring new
>option to : ip, tc, and bridge tools.
>Regarding other misc tools - will do it later as I am not very familiar with them.
>Are you OK with this Jiri ?
Yep. Thank you!
>
>Regards,
^ permalink raw reply
* RE: [RFC PATCH net-next 1/1] net: Support for switch port configuration
From: Rosen, Rami @ 2014-12-13 14:39 UTC (permalink / raw)
To: Varlese, Marco, Roopa Prabhu, Jiri Pirko
Cc: John Fastabend, netdev@vger.kernel.org,
stephen@networkplumber.org, Fastabend, John R, sfeldma@gmail.com,
linux-kernel@vger.kernel.org
In-Reply-To: <C4896FB061E7DE4AAC93031BDCA044B104AC4609@IRSMSX108.ger.corp.intel.com>
Hi, all,
On the question of using netlink sockets rather than ethtool ioctls to set kernel network attributes from userspace, I fully agree with Marco. The netlink API is much more structured, and much better suited to this type of operation, than the ioctl-based ethtool interface.
Regards,
Rami Rosen
Software Engineer, Intel
-----Original Message-----
From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On Behalf Of Varlese, Marco
Sent: Friday, December 12, 2014 11:20
To: Roopa Prabhu; Jiri Pirko
Cc: John Fastabend; netdev@vger.kernel.org; stephen@networkplumber.org; Fastabend, John R; sfeldma@gmail.com; linux-kernel@vger.kernel.org
Subject: RE: [RFC PATCH net-next 1/1] net: Support for switch port configuration
> -----Original Message-----
> From: Roopa Prabhu [mailto:roopa@cumulusnetworks.com]
> Sent: Thursday, December 11, 2014 5:41 PM
> To: Jiri Pirko
> Cc: Varlese, Marco; John Fastabend; netdev@vger.kernel.org;
> stephen@networkplumber.org; Fastabend, John R; sfeldma@gmail.com;
> linux-kernel@vger.kernel.org
> Subject: Re: [RFC PATCH net-next 1/1] net: Support for switch port
> configuration
>
> On 12/11/14, 8:56 AM, Jiri Pirko wrote:
> > Thu, Dec 11, 2014 at 05:37:46PM CET, roopa@cumulusnetworks.com wrote:
> >> On 12/11/14, 3:01 AM, Jiri Pirko wrote:
> >>> Thu, Dec 11, 2014 at 10:59:42AM CET, marco.varlese@intel.com wrote:
> >>>>> -----Original Message-----
> >>>>> From: John Fastabend [mailto:john.fastabend@gmail.com]
> >>>>> Sent: Wednesday, December 10, 2014 5:04 PM
> >>>>> To: Jiri Pirko
> >>>>> Cc: Varlese, Marco; netdev@vger.kernel.org;
> >>>>> stephen@networkplumber.org; Fastabend, John R;
> >>>>> roopa@cumulusnetworks.com; sfeldma@gmail.com; linux-
> >>>>> kernel@vger.kernel.org
> >>>>> Subject: Re: [RFC PATCH net-next 1/1] net: Support for switch
> >>>>> port configuration
> >>>>>
> >>>>> On 12/10/2014 08:50 AM, Jiri Pirko wrote:
> >>>>>> Wed, Dec 10, 2014 at 05:23:40PM CET, marco.varlese@intel.com
> wrote:
> >>>>>>> From: Marco Varlese <marco.varlese@intel.com>
> >>>>>>>
> >>>>>>> Switch hardware offers a list of attributes that are
> >>>>>>> configurable on a per port basis.
> >>>>>>> This patch provides a mechanism to configure switch ports by
> >>>>>>> adding an NDO for setting specific values to specific attributes.
> >>>>>>> There will be a separate patch that extends iproute2 to call
> >>>>>>> the new NDO.
> >>>>>> What are these attributes? Can you give some examples. I'm
> >>>>>> asking because there is a plan to pass generic attributes to
> >>>>>> switch ports replacing current specific
> >>>>>> ndo_switch_port_stp_update. In this case, bridge is setting that attribute.
> >>>>>>
> >>>>>> Is there need to set something directly from userspace or does
> >>>>>> it make rather sense to use involved bridge/ovs/bond ? I think
> >>>>>> that both will be needed.
> >>>>> +1
> >>>>>
> >>>>> I think for many attributes it would be best to have both. The
> >>>>> in kernel callers and netlink userspace can use the same driver ndo_ops.
> >>>>>
> >>>>> But then we don't _require_ any specific bridge/ovs/etc module.
> >>>>> And we may have some attributes that are not specific to any
> >>>>> existing software module. I'm guessing Marco has some examples
> >>>>> of
> these.
> >>>>>
> >>>>> [...]
> >>>>>
> >>>>>
> >>>>> --
> >>>>> John Fastabend Intel Corporation
> >>>> We do have a need to configure the attributes directly from
> >>>> user-space
> and I have identified the tool to do that in iproute2.
> >>>>
> >>>> An example of attributes are:
> >>>> * enabling/disabling of learning of source addresses on a given
> >>>> port (you can imagine the attribute called LEARNING for example);
> >>>> * internal loopback control (i.e. LOOPBACK) which will control
> >>>> how the flow of traffic behaves from the switch fabric towards an
> >>>> egress port;
> >>>> * flooding for broadcast/multicast/unicast type of packets (i.e.
> >>>> BFLOODING, MFLOODING, UFLOODING);
> >>>>
> >>>> Some attributes would be of the type enabled/disabled while other
> >>>> will
> allow specific values to allow the user to configure different
> behaviours of that feature on that particular port on that platform.
> >>>>
> >>>> One thing to mention - as John stated as well - there might be
> >>>> some
> attributes that are not specific to any software module but rather
> have to do with the actual hardware/platform to configure.
> >>>>
> >>>> I hope this clarifies some points.
> >>> It does. Makes sense. We need to expose this attr set/get for both
> >>> in-kernel and userspace use cases.
> >>>
> >>> Please adjust you patch for this. Also, as a second patch, it
> >>> would be great if you can convert ndo_switch_port_stp_update to
> >>> this new
> ndo.
> >> Why are we exposing generic switch attribute get/set from userspace
> >> ?. We already have specific attributes for learning/flooding which
> >> can be extended further.
> > Yes, but that is for PF_BRIDGE and bridge specific attributes. There
> > might be another generic attrs, no?
> I cant think of any. And plus, the whole point of switchdev l2
> offloads was to map these to existing bridge attributes. And we
> already have a match for some of the attributes that marco wants.
>
> If there is a need for such attributes, i don't see why it is needed
> for switch devices only.
> It is needed for any hw (nics etc). And, a precedence to this is to do
> it via ethtool.
>
> Having said that, am sure we will find a need for this in the future.
> And having a netlink attribute always helps.
>
> Today, it seems like these can be mapped to existing attributes that
> are settable via ndo_bridge_setlink/getlink.
>
> >
> >> And for in kernel api....i had a sample patch in my RFC series
> >> (Which i was going to resubmit, until it was decided that we will
> >> use existing api around
> >> ndo_bridge_setlink/ndo_bridge_getlink):
> >> http://www.spinics.net/lists/netdev/msg305473.html
> > Yes, this might become handy for other generic non-bridge attrs.
> >
> >> Thanks,
> >> Roopa
> >>
> >>
> >>
The list I provided is only a subset of the attributes we will need to expose. I do have more, and I'm sure that more will come in the future. As I mentioned in a few earlier posts, some attributes are generic and some are not.
I did not consider ethtool for a few reasons, but the main one is that I was under the impression that netlink was preferred over ethtool_ops in many circumstances. Plus, all the cases I have identified so far will fit nicely into the setlink set of operations.
^ permalink raw reply
* Re: [PATCH iproute2 v4] ip: Simplify executing ip cmd within network ns
From: vadim4j @ 2014-12-13 13:32 UTC (permalink / raw)
To: Jiri Pirko; +Cc: Vadim Kochan, netdev
In-Reply-To: <20141213085803.GA12446@angus-think.lan>
On Sat, Dec 13, 2014 at 10:58:03AM +0200, vadim4j@gmail.com wrote:
> On Sat, Dec 13, 2014 at 10:42:43AM +0200, vadim4j@gmail.com wrote:
> > On Sat, Dec 13, 2014 at 09:29:36AM +0100, Jiri Pirko wrote:
> > > Fri, Dec 12, 2014 at 11:15:07PM CET, vadim4j@gmail.com wrote:
> > > >From: Vadim Kochan <vadim4j@gmail.com>
> > > >
> > > >Added new '-netns' option to simplify executing following cmd:
> > > >
> > > > ip netns exec NETNS ip OPTIONS COMMAND OBJECT
> > > >
> > > > to
> > > >
> > > > ip -n[etns] NETNS OPTIONS COMMAND OBJECT
> > > >
> > > >e.g.:
> > > >
> > > > ip -net vnet0 link add br0 type bridge
> > > > ip -n vnet0 link
> > > >
> > > >Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
> > >
> > >
> > > This looks good. I'm still missing support in tc, bridge, etc. I think
> > > it would be great to do this in the same patch/patchset.
> > >
> > I planned to do this in the future patches after this main
> > changes will be accepted. Actually adding this option to other
> > tools is trivial.
> >
> > Anyway may be I will re-send v5 with supporting of these tools if I will have time.
> >
> > Regards,
>
> BTW, some tools already have '-n' option, so I think only '-net' can be
> used in such cases.
>
> Regards,
OK, I am going to split the changes into a series of patches and bring the new
option to the ip, tc, and bridge tools.
Regarding the other misc tools, I will do that later as I am not very familiar with them.
Are you OK with this, Jiri?
Regards,
^ permalink raw reply
* Re: [PATCH iproute2 v4] ip: Simplify executing ip cmd within network ns
From: vadim4j @ 2014-12-13 8:58 UTC (permalink / raw)
To: Jiri Pirko; +Cc: Vadim Kochan, netdev
In-Reply-To: <20141213084243.GA3284@angus-think.lan>
On Sat, Dec 13, 2014 at 10:42:43AM +0200, vadim4j@gmail.com wrote:
> On Sat, Dec 13, 2014 at 09:29:36AM +0100, Jiri Pirko wrote:
> > Fri, Dec 12, 2014 at 11:15:07PM CET, vadim4j@gmail.com wrote:
> > >From: Vadim Kochan <vadim4j@gmail.com>
> > >
> > >Added new '-netns' option to simplify executing following cmd:
> > >
> > > ip netns exec NETNS ip OPTIONS COMMAND OBJECT
> > >
> > > to
> > >
> > > ip -n[etns] NETNS OPTIONS COMMAND OBJECT
> > >
> > >e.g.:
> > >
> > > ip -net vnet0 link add br0 type bridge
> > > ip -n vnet0 link
> > >
> > >Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
> >
> >
> > This looks good. I'm still missing support in tc, bridge, etc. I think
> > it would be great to do this in the same patch/patchset.
> >
> I planned to do this in the future patches after this main
> changes will be accepted. Actually adding this option to other
> tools is trivial.
>
> Anyway may be I will re-send v5 with supporting of these tools if I will have time.
>
> Regards,
BTW, some tools already have a '-n' option, so I think only '-net' can be
used in such cases.
Regards,
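
A minimal sketch of why a bare '-n' is ambiguous under prefix-style option
matching. It assumes the option parser accepts any non-empty prefix of the
full option name; prefix_matches() and the "-numeric" option are hypothetical
names used only for illustration, not code from the patch:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 0 when `opt` is a non-empty prefix of
 * `full`, and a nonzero value otherwise. */
static int prefix_matches(const char *opt, const char *full)
{
	size_t len;

	assert(opt != NULL && full != NULL);
	len = strlen(opt);
	if (len == 0 || len > strlen(full))
		return -1;
	return memcmp(full, opt, len) != 0;
}
```

With such a matcher, "-n" is a prefix of both "-netns" and a hypothetical
"-numeric", while "-net" only matches "-netns", which is why the longer
spelling is the safe one in tools that already use '-n'.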
^ permalink raw reply
* Re: [PATCH iproute2 v4] ip: Simplify executing ip cmd within network ns
From: vadim4j @ 2014-12-13 8:42 UTC (permalink / raw)
To: Jiri Pirko; +Cc: Vadim Kochan, netdev
In-Reply-To: <20141213082936.GA1849@nanopsycho.orion>
On Sat, Dec 13, 2014 at 09:29:36AM +0100, Jiri Pirko wrote:
> Fri, Dec 12, 2014 at 11:15:07PM CET, vadim4j@gmail.com wrote:
> >From: Vadim Kochan <vadim4j@gmail.com>
> >
> >Added new '-netns' option to simplify executing following cmd:
> >
> > ip netns exec NETNS ip OPTIONS COMMAND OBJECT
> >
> > to
> >
> > ip -n[etns] NETNS OPTIONS COMMAND OBJECT
> >
> >e.g.:
> >
> > ip -net vnet0 link add br0 type bridge
> > ip -n vnet0 link
> >
> >Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
>
>
> This looks good. I'm still missing support in tc, bridge, etc. I think
> it would be great to do this in the same patch/patchset.
>
I planned to do this in future patches, after these main
changes are accepted. Actually, adding this option to the other
tools is trivial.
Anyway, maybe I will re-send v5 with support for these tools if I have time.
Regards,
^ permalink raw reply
* Re: [PATCH iproute2 v4] ip: Simplify executing ip cmd within network ns
From: Jiri Pirko @ 2014-12-13 8:29 UTC (permalink / raw)
To: Vadim Kochan; +Cc: netdev
In-Reply-To: <1418422507-6635-1-git-send-email-vadim4j@gmail.com>
Fri, Dec 12, 2014 at 11:15:07PM CET, vadim4j@gmail.com wrote:
>From: Vadim Kochan <vadim4j@gmail.com>
>
>Added new '-netns' option to simplify executing following cmd:
>
> ip netns exec NETNS ip OPTIONS COMMAND OBJECT
>
> to
>
> ip -n[etns] NETNS OPTIONS COMMAND OBJECT
>
>e.g.:
>
> ip -net vnet0 link add br0 type bridge
> ip -n vnet0 link
>
>Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
This looks good. I'm still missing support in tc, bridge, etc. I think
it would be great to do this in the same patch/patchset.
>---
> include/namespace.h | 46 +++++++++++++++++++++++
> ip/ip.c | 5 +++
> ip/ipnetns.c | 106 ++--------------------------------------------------
> lib/Makefile | 6 ++-
> lib/namespace.c | 86 ++++++++++++++++++++++++++++++++++++++++++
> man/man8/ip.8 | 23 +++++++++++-
> 6 files changed, 167 insertions(+), 105 deletions(-)
> create mode 100644 include/namespace.h
> create mode 100644 lib/namespace.c
>
>diff --git a/include/namespace.h b/include/namespace.h
>new file mode 100644
>index 0000000..2f13e65
>--- /dev/null
>+++ b/include/namespace.h
>@@ -0,0 +1,46 @@
>+#ifndef __NAMESPACE_H__
>+#define __NAMESPACE_H__ 1
>+
>+#include <sched.h>
>+#include <sys/mount.h>
>+#include <errno.h>
>+
>+#define NETNS_RUN_DIR "/var/run/netns"
>+#define NETNS_ETC_DIR "/etc/netns"
>+
>+#ifndef CLONE_NEWNET
>+#define CLONE_NEWNET 0x40000000 /* New network namespace (lo, device, names sockets, etc) */
>+#endif
>+
>+#ifndef MNT_DETACH
>+#define MNT_DETACH 0x00000002 /* Just detach from the tree */
>+#endif /* MNT_DETACH */
>+
>+/* sys/mount.h may be out too old to have these */
>+#ifndef MS_REC
>+#define MS_REC 16384
>+#endif
>+
>+#ifndef MS_SLAVE
>+#define MS_SLAVE (1 << 19)
>+#endif
>+
>+#ifndef MS_SHARED
>+#define MS_SHARED (1 << 20)
>+#endif
>+
>+#ifndef HAVE_SETNS
>+static int setns(int fd, int nstype)
>+{
>+#ifdef __NR_setns
>+ return syscall(__NR_setns, fd, nstype);
>+#else
>+ errno = ENOSYS;
>+ return -1;
>+#endif
>+}
>+#endif /* HAVE_SETNS */
>+
>+extern int netns_switch(char *netns);
>+
>+#endif /* __NAMESPACE_H__ */
>diff --git a/ip/ip.c b/ip/ip.c
>index 5f759d5..d6c9123 100644
>--- a/ip/ip.c
>+++ b/ip/ip.c
>@@ -22,6 +22,7 @@
> #include "SNAPSHOT.h"
> #include "utils.h"
> #include "ip_common.h"
>+#include "namespace.h"
>
> int preferred_family = AF_UNSPEC;
> int human_readable = 0;
>@@ -262,6 +263,10 @@ int main(int argc, char **argv)
> rcvbuf = size;
> } else if (matches(opt, "-help") == 0) {
> usage();
>+ } else if (matches(opt, "-netns") == 0) {
>+ NEXT_ARG();
>+ if (netns_switch(argv[1]))
>+ exit(-1);
> } else {
> fprintf(stderr, "Option \"%s\" is unknown, try \"ip -help\".\n", opt);
> exit(-1);
>diff --git a/ip/ipnetns.c b/ip/ipnetns.c
>index 1c8aa02..519d518 100644
>--- a/ip/ipnetns.c
>+++ b/ip/ipnetns.c
>@@ -17,42 +17,7 @@
>
> #include "utils.h"
> #include "ip_common.h"
>-
>-#define NETNS_RUN_DIR "/var/run/netns"
>-#define NETNS_ETC_DIR "/etc/netns"
>-
>-#ifndef CLONE_NEWNET
>-#define CLONE_NEWNET 0x40000000 /* New network namespace (lo, device, names sockets, etc) */
>-#endif
>-
>-#ifndef MNT_DETACH
>-#define MNT_DETACH 0x00000002 /* Just detach from the tree */
>-#endif /* MNT_DETACH */
>-
>-/* sys/mount.h may be out too old to have these */
>-#ifndef MS_REC
>-#define MS_REC 16384
>-#endif
>-
>-#ifndef MS_SLAVE
>-#define MS_SLAVE (1 << 19)
>-#endif
>-
>-#ifndef MS_SHARED
>-#define MS_SHARED (1 << 20)
>-#endif
>-
>-#ifndef HAVE_SETNS
>-static int setns(int fd, int nstype)
>-{
>-#ifdef __NR_setns
>- return syscall(__NR_setns, fd, nstype);
>-#else
>- errno = ENOSYS;
>- return -1;
>-#endif
>-}
>-#endif /* HAVE_SETNS */
>+#include "namespace.h"
>
> static int usage(void)
> {
>@@ -101,42 +66,12 @@ static int netns_list(int argc, char **argv)
> return 0;
> }
>
>-static void bind_etc(const char *name)
>-{
>- char etc_netns_path[MAXPATHLEN];
>- char netns_name[MAXPATHLEN];
>- char etc_name[MAXPATHLEN];
>- struct dirent *entry;
>- DIR *dir;
>-
>- snprintf(etc_netns_path, sizeof(etc_netns_path), "%s/%s", NETNS_ETC_DIR, name);
>- dir = opendir(etc_netns_path);
>- if (!dir)
>- return;
>-
>- while ((entry = readdir(dir)) != NULL) {
>- if (strcmp(entry->d_name, ".") == 0)
>- continue;
>- if (strcmp(entry->d_name, "..") == 0)
>- continue;
>- snprintf(netns_name, sizeof(netns_name), "%s/%s", etc_netns_path, entry->d_name);
>- snprintf(etc_name, sizeof(etc_name), "/etc/%s", entry->d_name);
>- if (mount(netns_name, etc_name, "none", MS_BIND, NULL) < 0) {
>- fprintf(stderr, "Bind %s -> %s failed: %s\n",
>- netns_name, etc_name, strerror(errno));
>- }
>- }
>- closedir(dir);
>-}
>-
> static int netns_exec(int argc, char **argv)
> {
> /* Setup the proper environment for apps that are not netns
> * aware, and execute a program in that environment.
> */
>- const char *name, *cmd;
>- char net_path[MAXPATHLEN];
>- int netns;
>+ const char *cmd;
>
> if (argc < 1) {
> fprintf(stderr, "No netns name specified\n");
>@@ -146,45 +81,10 @@ static int netns_exec(int argc, char **argv)
> fprintf(stderr, "No command specified\n");
> return -1;
> }
>-
>- name = argv[0];
> cmd = argv[1];
>- snprintf(net_path, sizeof(net_path), "%s/%s", NETNS_RUN_DIR, name);
>- netns = open(net_path, O_RDONLY | O_CLOEXEC);
>- if (netns < 0) {
>- fprintf(stderr, "Cannot open network namespace \"%s\": %s\n",
>- name, strerror(errno));
>- return -1;
>- }
>-
>- if (setns(netns, CLONE_NEWNET) < 0) {
>- fprintf(stderr, "setting the network namespace \"%s\" failed: %s\n",
>- name, strerror(errno));
>- return -1;
>- }
>
>- if (unshare(CLONE_NEWNS) < 0) {
>- fprintf(stderr, "unshare failed: %s\n", strerror(errno));
>- return -1;
>- }
>- /* Don't let any mounts propagate back to the parent */
>- if (mount("", "/", "none", MS_SLAVE | MS_REC, NULL)) {
>- fprintf(stderr, "\"mount --make-rslave /\" failed: %s\n",
>- strerror(errno));
>+ if (netns_switch(argv[0]))
> return -1;
>- }
>- /* Mount a version of /sys that describes the network namespace */
>- if (umount2("/sys", MNT_DETACH) < 0) {
>- fprintf(stderr, "umount of /sys failed: %s\n", strerror(errno));
>- return -1;
>- }
>- if (mount(name, "/sys", "sysfs", 0, NULL) < 0) {
>- fprintf(stderr, "mount of /sys failed: %s\n",strerror(errno));
>- return -1;
>- }
>-
>- /* Setup bind mounts for config files in /etc */
>- bind_etc(name);
>
> fflush(stdout);
>
>diff --git a/lib/Makefile b/lib/Makefile
>index a42b885..66f89f1 100644
>--- a/lib/Makefile
>+++ b/lib/Makefile
>@@ -1,8 +1,12 @@
> include ../Config
>
>+ifeq ($(IP_CONFIG_SETNS),y)
>+ CFLAGS += -DHAVE_SETNS
>+endif
>+
> CFLAGS += -fPIC
>
>-UTILOBJ=utils.o rt_names.o ll_types.o ll_proto.o ll_addr.o inet_proto.o
>+UTILOBJ=utils.o rt_names.o ll_types.o ll_proto.o ll_addr.o inet_proto.o namespace.o
>
> NLOBJ=libgenl.o ll_map.o libnetlink.o
>
>diff --git a/lib/namespace.c b/lib/namespace.c
>new file mode 100644
>index 0000000..1554ce0
>--- /dev/null
>+++ b/lib/namespace.c
>@@ -0,0 +1,86 @@
>+/*
>+ * namespace.c
>+ *
>+ * This program is free software; you can redistribute it and/or
>+ * modify it under the terms of the GNU General Public License
>+ * as published by the Free Software Foundation; either version
>+ * 2 of the License, or (at your option) any later version.
>+ */
>+
>+#include <fcntl.h>
>+#include <dirent.h>
>+
>+#include "utils.h"
>+#include "namespace.h"
>+
>+static void bind_etc(const char *name)
>+{
>+ char etc_netns_path[MAXPATHLEN];
>+ char netns_name[MAXPATHLEN];
>+ char etc_name[MAXPATHLEN];
>+ struct dirent *entry;
>+ DIR *dir;
>+
>+ snprintf(etc_netns_path, sizeof(etc_netns_path), "%s/%s", NETNS_ETC_DIR, name);
>+ dir = opendir(etc_netns_path);
>+ if (!dir)
>+ return;
>+
>+ while ((entry = readdir(dir)) != NULL) {
>+ if (strcmp(entry->d_name, ".") == 0)
>+ continue;
>+ if (strcmp(entry->d_name, "..") == 0)
>+ continue;
>+ snprintf(netns_name, sizeof(netns_name), "%s/%s", etc_netns_path, entry->d_name);
>+ snprintf(etc_name, sizeof(etc_name), "/etc/%s", entry->d_name);
>+ if (mount(netns_name, etc_name, "none", MS_BIND, NULL) < 0) {
>+ fprintf(stderr, "Bind %s -> %s failed: %s\n",
>+ netns_name, etc_name, strerror(errno));
>+ }
>+ }
>+ closedir(dir);
>+}
>+
>+int netns_switch(char *name)
>+{
>+ char net_path[MAXPATHLEN];
>+ int netns;
>+
>+ snprintf(net_path, sizeof(net_path), "%s/%s", NETNS_RUN_DIR, name);
>+ netns = open(net_path, O_RDONLY | O_CLOEXEC);
>+ if (netns < 0) {
>+ fprintf(stderr, "Cannot open network namespace \"%s\": %s\n",
>+ name, strerror(errno));
>+ return -1;
>+ }
>+
>+ if (setns(netns, CLONE_NEWNET) < 0) {
>+ fprintf(stderr, "setting the network namespace \"%s\" failed: %s\n",
>+ name, strerror(errno));
>+ return -1;
>+ }
>+
>+ if (unshare(CLONE_NEWNS) < 0) {
>+ fprintf(stderr, "unshare failed: %s\n", strerror(errno));
>+ return -1;
>+ }
>+ /* Don't let any mounts propagate back to the parent */
>+ if (mount("", "/", "none", MS_SLAVE | MS_REC, NULL)) {
>+ fprintf(stderr, "\"mount --make-rslave /\" failed: %s\n",
>+ strerror(errno));
>+ return -1;
>+ }
>+ /* Mount a version of /sys that describes the network namespace */
>+ if (umount2("/sys", MNT_DETACH) < 0) {
>+ fprintf(stderr, "umount of /sys failed: %s\n", strerror(errno));
>+ return -1;
>+ }
>+ if (mount(name, "/sys", "sysfs", 0, NULL) < 0) {
>+ fprintf(stderr, "mount of /sys failed: %s\n",strerror(errno));
>+ return -1;
>+ }
>+
>+ /* Setup bind mounts for config files in /etc */
>+ bind_etc(name);
>+ return 0;
>+}
>diff --git a/man/man8/ip.8 b/man/man8/ip.8
>index 2d42e98..0fb759d 100644
>--- a/man/man8/ip.8
>+++ b/man/man8/ip.8
>@@ -31,7 +31,8 @@ ip \- show / manipulate routing, devices, policy routing and tunnels
> \fB\-r\fR[\fIesolve\fR] |
> \fB\-f\fR[\fIamily\fR] {
> .BR inet " | " inet6 " | " ipx " | " dnet " | " link " } | "
>-\fB\-o\fR[\fIneline\fR] }
>+\fB\-o\fR[\fIneline\fR] |
>+\fB\-n\fR[\fIetns\fR] }
>
>
> .SH OPTIONS
>@@ -134,6 +135,26 @@ the output.
> use the system's name resolver to print DNS names instead of
> host addresses.
>
>+.TP
>+.BR "\-n" , " \-net" , " \-netns " <NETNS>
>+switches
>+.B ip
>+to the specified network namespace
>+.IR NETNS .
>+Actually it just simplifies executing of:
>+
>+.B ip netns exec
>+.IR NETNS
>+.B ip
>+.RI "[ " OPTIONS " ] " OBJECT " { " COMMAND " | "
>+.BR help " }"
>+
>+to
>+
>+.B ip
>+.RI "-n[etns] " NETNS " [ " OPTIONS " ] " OBJECT " { " COMMAND " | "
>+.BR help " }"
>+
> .SH IP - COMMAND SYNTAX
>
> .SS
>--
>2.1.3
>
^ permalink raw reply
* Re: [RFC PATCH net-next 1/1] net: Support for switch port configuration
From: Roopa Prabhu @ 2014-12-13 7:06 UTC (permalink / raw)
To: Varlese, Marco
Cc: Jiri Pirko, John Fastabend, netdev@vger.kernel.org,
stephen@networkplumber.org, Fastabend, John R, sfeldma@gmail.com,
linux-kernel@vger.kernel.org
In-Reply-To: <C4896FB061E7DE4AAC93031BDCA044B104AC4609@IRSMSX108.ger.corp.intel.com>
On 12/12/14, 1:19 AM, Varlese, Marco wrote:
>> -----Original Message-----
>> From: Roopa Prabhu [mailto:roopa@cumulusnetworks.com]
>> Sent: Thursday, December 11, 2014 5:41 PM
>> To: Jiri Pirko
>> Cc: Varlese, Marco; John Fastabend; netdev@vger.kernel.org;
>> stephen@networkplumber.org; Fastabend, John R; sfeldma@gmail.com;
>> linux-kernel@vger.kernel.org
>> Subject: Re: [RFC PATCH net-next 1/1] net: Support for switch port
>> configuration
>>
>> On 12/11/14, 8:56 AM, Jiri Pirko wrote:
>>> Thu, Dec 11, 2014 at 05:37:46PM CET, roopa@cumulusnetworks.com wrote:
>>>> On 12/11/14, 3:01 AM, Jiri Pirko wrote:
>>>>> Thu, Dec 11, 2014 at 10:59:42AM CET, marco.varlese@intel.com wrote:
>>>>>>> -----Original Message-----
>>>>>>> From: John Fastabend [mailto:john.fastabend@gmail.com]
>>>>>>> Sent: Wednesday, December 10, 2014 5:04 PM
>>>>>>> To: Jiri Pirko
>>>>>>> Cc: Varlese, Marco; netdev@vger.kernel.org;
>>>>>>> stephen@networkplumber.org; Fastabend, John R;
>>>>>>> roopa@cumulusnetworks.com; sfeldma@gmail.com; linux-
>>>>>>> kernel@vger.kernel.org
>>>>>>> Subject: Re: [RFC PATCH net-next 1/1] net: Support for switch port
>>>>>>> configuration
>>>>>>>
>>>>>>> On 12/10/2014 08:50 AM, Jiri Pirko wrote:
>>>>>>>> Wed, Dec 10, 2014 at 05:23:40PM CET, marco.varlese@intel.com
>> wrote:
>>>>>>>>> From: Marco Varlese <marco.varlese@intel.com>
>>>>>>>>>
>>>>>>>>> Switch hardware offers a list of attributes that are
>>>>>>>>> configurable on a per port basis.
>>>>>>>>> This patch provides a mechanism to configure switch ports by
>>>>>>>>> adding an NDO for setting specific values to specific attributes.
>>>>>>>>> There will be a separate patch that extends iproute2 to call the
>>>>>>>>> new NDO.
>>>>>>>> What are these attributes? Can you give some examples. I'm asking
>>>>>>>> because there is a plan to pass generic attributes to switch
>>>>>>>> ports replacing current specific ndo_switch_port_stp_update. In
>>>>>>>> this case, bridge is setting that attribute.
>>>>>>>>
>>>>>>>> Is there need to set something directly from userspace or does it
>>>>>>>> make rather sense to use involved bridge/ovs/bond ? I think that
>>>>>>>> both will be needed.
>>>>>>> +1
>>>>>>>
>>>>>>> I think for many attributes it would be best to have both. The in
>>>>>>> kernel callers and netlink userspace can use the same driver ndo_ops.
>>>>>>>
>>>>>>> But then we don't _require_ any specific bridge/ovs/etc module.
>>>>>>> And we may have some attributes that are not specific to any
>>>>>>> existing software module. I'm guessing Marco has some examples of
>> these.
>>>>>>> [...]
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> John Fastabend Intel Corporation
>>>>>> We do have a need to configure the attributes directly from user-space
>> and I have identified the tool to do that in iproute2.
>>>>>> An example of attributes are:
>>>>>> * enabling/disabling of learning of source addresses on a given
>>>>>> port (you can imagine the attribute called LEARNING for example);
>>>>>> * internal loopback control (i.e. LOOPBACK) which will control how
>>>>>> the flow of traffic behaves from the switch fabric towards an
>>>>>> egress port;
>>>>>> * flooding for broadcast/multicast/unicast type of packets (i.e.
>>>>>> BFLOODING, MFLOODING, UFLOODING);
>>>>>>
>>>>>> Some attributes would be of the type enabled/disabled while other will
>> allow specific values to allow the user to configure different behaviours of
>> that feature on that particular port on that platform.
>>>>>> One thing to mention - as John stated as well - there might be some
>> attributes that are not specific to any software module but rather have to do
>> with the actual hardware/platform to configure.
>>>>>> I hope this clarifies some points.
>>>>> It does. Makes sense. We need to expose this attr set/get for both
>>>>> in-kernel and userspace use cases.
>>>>>
>>>>> Please adjust you patch for this. Also, as a second patch, it would
>>>>> be great if you can convert ndo_switch_port_stp_update to this new
>> ndo.
>>>> Why are we exposing generic switch attribute get/set from userspace
>>>> ?. We already have specific attributes for learning/flooding which
>>>> can be extended further.
>>> Yes, but that is for PF_BRIDGE and bridge specific attributes. There
>>> might be another generic attrs, no?
>> I cant think of any. And plus, the whole point of switchdev l2 offloads was to
>> map these to existing bridge attributes. And we already have a match for
>> some of the attributes that marco wants.
>>
>> If there is a need for such attributes, i don't see why it is needed for switch
>> devices only.
>> It is needed for any hw (nics etc). And, a precedence to this is to do it via
>> ethtool.
>>
>> Having said that, am sure we will find a need for this in the future.
>> And having a netlink attribute always helps.
>>
>> Today, it seems like these can be mapped to existing attributes that are
>> settable via ndo_bridge_setlink/getlink.
>>
>>>> And for in kernel api....i had a sample patch in my RFC series (Which
>>>> i was going to resubmit, until it was decided that we will use
>>>> existing api around
>>>> ndo_bridge_setlink/ndo_bridge_getlink):
>>>> http://www.spinics.net/lists/netdev/msg305473.html
>>> Yes, this might become handy for other generic non-bridge attrs.
>>>
>>>> Thanks,
>>>> Roopa
>>>>
>>>>
>>>>
> The list I provided is only a subset of the attributes we will need to be exposed. I do have more and I'm sure that more will come in the future. As I mentioned in few posts earlier, some attributes are generic and some are not.
>
> I did not consider ethtool for few reasons but the main one is that I was under the impression that netlink was preferred in many circumstances over the ethotool_ops.
That is correct. I don't think anybody hinted that you should extend
ethtool.
> Plus, all the cases I have identified so far are going to nicely fit into the setlink set of operations.
>
Would be better if you submitted your iproute2 patch with this patch.
I do plan to resubmit my generic ndo patch soon.
Thanks,
Roopa
^ permalink raw reply
* Re: [PATCH v2 0/6] net-PPP: Deletion of a few unnecessary checks
From: SF Markus Elfring @ 2014-12-13 6:17 UTC (permalink / raw)
To: David Miller
Cc: Sergei Shtylyov, Paul Mackerras, linux-ppp, netdev, Eric Dumazet,
linux-kernel, kernel-janitors, Julia Lawall
In-Reply-To: <20141212.150741.2169710971698369167.davem@davemloft.net>
> I'd like to honestly ask why you are being so difficult?
There are several factors which contribute to your perception of
difficulty here.
1. I try to extract from every feedback the information about the amount
of acceptance or rejection for a specific update suggestion.
A terse feedback (like yours for this issue) makes it occasionally
harder to see the next useful steps. So another constructive discussion
is evolving around the clarification of some implementation details.
2. I prefer also different communication styles at some points.
3. I reached a point where the desired software updates were not
immediately obvious for me while other contributors might have achieved
a better understanding for the affected issues already.
4. I am on the way at the moment to get my Linux software development
system running again.
https://forums.opensuse.org/showthread.php/503327-System-startup-does-not-continue-after-hard-disk-detection
> Everyone gets their code reviewed, everyone has to modify their
> changes to adhere to the subsystem maintainer's wishes.
That is fine as usual.
> You are not being treated specially, and quite frankly nobody
> is asking anything unreasonable of you.
That is also true as the software development process will be continued.
Regards,
Markus
^ permalink raw reply
* Re: [PATCH v2 0/6] net-PPP: Deletion of a few unnecessary checks
From: SF Markus Elfring @ 2014-12-13 6:05 UTC (permalink / raw)
To: Eric Dumazet
Cc: David Miller, Sergei Shtylyov, Paul Mackerras, linux-ppp, netdev,
linux-kernel, kernel-janitors, Julia Lawall
In-Reply-To: <1418411287.13491.22.camel@edumazet-glaptop2.roam.corp.google.com>
> We are in the merge window, tracking bugs added in latest dev cycle.
I am also curious on the software evolution about how many improvements will
arrive in the next Linux versions.
> Having to deal with patches like yours is adding pressure
> on the maintainer (and other developers) at the wrong time.
You can relax a bit eventually. More merge windows will follow, won't they?
It will be nice if a bunch of recent code clean-ups which were also
triggered by static source code analysis will be integrated into Linux 3.19 already.
More update suggestions will be considered later again as usual.
Regards,
Markus
^ permalink raw reply
* Re: [WTF?] random test in netlink_sendmsg()
From: Al Viro @ 2014-12-13 4:51 UTC (permalink / raw)
To: David Miller; +Cc: kaber, netdev, Dmitry Tarnyagin
In-Reply-To: <20141213032500.GH22149@ZenIV.linux.org.uk>
On Sat, Dec 13, 2014 at 03:25:00AM +0000, Al Viro wrote:
> msg->msg_iter.type == KVEC_ITER &&
ITER_IOVEC, that is. And that way it even works... Are you OK with the
commit below?
netlink: make the check for "send from tx_ring" deterministic
As it is, zero msg_iovlen means that the first iovec in the kernel
array of iovecs is left uninitialized, so checking if its ->iov_base
is NULL is random. Since the real users of that thing are doing
sendto(fd, NULL, 0, ...), they are getting msg_iovlen = 1 and
msg_iov[0] = {NULL, 0}, which is what this test is trying to catch.
As suggested by davem, let's just check that msg_iovlen was 1 and
msg_iov[0].iov_base was NULL - _that_ is well-defined and it catches
what we want to catch.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index cc9bcf0..5bcb58c 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2304,7 +2304,11 @@ static int netlink_sendmsg(struct kiocb *kiocb, struct socket *sock,
goto out;
}
+ /* It's a really convoluted way for userland to ask for mmaped
+ * sendmsg(), but that's what we've got... */
if (netlink_tx_is_mmaped(sk) &&
+ msg->msg_iter.type == ITER_IOVEC &&
+ msg->msg_iter.nr_segs == 1 &&
msg->msg_iter.iov->iov_base == NULL) {
err = netlink_mmap_sendmsg(sk, msg, dst_portid, dst_group,
siocb);
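
The condition in this hunk can be modelled in userspace terms as follows.
This is only an illustration of the "exactly one iovec with a NULL base"
test expressed on a plain struct msghdr; looks_like_ring_send() is a
hypothetical name and the sketch is not kernel code:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Hypothetical model of the check: true only for the shape that
 * sendto(fd, NULL, 0, ...) produces, i.e. one iovec whose base is NULL. */
static int looks_like_ring_send(const struct msghdr *msg)
{
	assert(msg != NULL);
	return msg->msg_iovlen == 1 &&
	       msg->msg_iov != NULL &&
	       msg->msg_iov[0].iov_base == NULL;
}
```

A message with msg_iovlen == 0 never reads msg_iov[0] here, which is the
point of the fix: the result no longer depends on uninitialized data.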
^ permalink raw reply related
* Re: [WTF?] random test in netlink_sendmsg()
From: Al Viro @ 2014-12-13 3:25 UTC (permalink / raw)
To: David Miller; +Cc: kaber, netdev, Dmitry Tarnyagin
In-Reply-To: <20141212.213313.1419808296502891420.davem@davemloft.net>
On Fri, Dec 12, 2014 at 09:33:13PM -0500, David Miller wrote:
> Ok, so we just adjust the AF_PACKET check to test msg_iovlen==1 as
> well, and that takes care of that case.
AF_NETLINK, I suppose? AF_PACKET is avoiding all those contortions - they
simply do
if (po->tx_ring.pg_vec)
return tpacket_snd(po, msg);
else
return packet_snd(sock, msg, len);
IOW, if you have done PACKET_TX_RING, you get msg_iov completely ignored and
tx_ring used as data source, otherwise you get data coming from msg_iov.
For AF_NETLINK what you suggest would work, AFAICS. It's still bloody weird,
(if we want to be able to mix from-msg_iov and from-tx_ring sends on the same
socket, I'd probably go with MSG_<something> in flags), but existing userland
ABI is existing userland ABI...
So in terms of msg_iter, it turns into
/* It's a really convoluted way for userland to ask for mmaped
* sendmsg(), but that's what we've got... */
if (netlink_tx_is_mmaped(sk) &&
msg->msg_iter.type == KVEC_ITER &&
msg->msg_iter.nr_segs == 1 &&
msg->msg_iter.iov->iov_base == NULL) {
err = netlink_mmap_sendmsg(sk, msg, dst_portid, dst_group,
siocb);
goto out;
}
OK, that works...
Next fun place: AF_CAIF/SOCK_SEQPACKET has sendmsg() treating NULL
msg_iov[0].iov_base as EINVAL. No idea why - without that check zero-length
sendmsg() with such msg_iov would work and non-zero-length one would
fail with EFAULT instead. And this check is just as random in case of
msg_iovlen being 0. Could CAIF folks explain what's going on there?
My preference would be to remove that check completely...
^ permalink raw reply
* Re: [PATCH net] net/mlx4_en: correct the endianness of doorbell_qpn on big endian platform
From: Wei Yang @ 2014-12-13 3:13 UTC (permalink / raw)
To: David Laight
Cc: 'Eric Dumazet', David Miller, netdev@vger.kernel.org,
gideonn@mellanox.com, edumazet@google.com, amirv@mellanox.com
In-Reply-To: <20141208144237.GB8382@richard>
On Mon, Dec 08, 2014 at 10:42:37PM +0800, Wei Yang wrote:
>On Mon, Dec 08, 2014 at 10:00:19AM +0000, David Laight wrote:
>>From: Eric Dumazet
>>> On Fri, 2014-12-05 at 21:31 -0800, David Miller wrote:
>>>
>>> > Guys, let's figure out what we are doing with this patch.
>>> > --
>>>
>>> Oh well, patch is fine, please apply it, thanks !
>>
>>I'm not to sure that the patch doesn't generate a software byteswap
>>followed by a byteswapping write on ppc - clearly not ideal.
>>It might even generate back to back software byteswaps.
>>
>>If the write to the doorbell register includes a byteswap on BE (ppc)
>>then there is no real value in keeping the value as BE.
>>
>>OTOH ppc ought to have ways of doing IO writes without the byteswap
>>(and byteswapping accesses to non-io memory for that matter).
>>
>>What happens on a BE system with BE peripherals is another matter.
>
>David
>
>Thanks for your comment.
>
>How about use __raw_writel() to replace the iowrite32()? Looks this is better,
>if so, I will make up another version for this.
>
Hi, David
If you prefer this way, I would like to send a new version for this.
Is it ok for you?
>>
>> David
>>
>
>--
>Richard Yang
>Help you, Help me
--
Richard Yang
Help you, Help me
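
The "back to back software byteswaps" concern quoted above comes down to the
fact that two 32-bit byte swaps cancel each other. A minimal sketch in plain
C; swab32() here is a local stand-in written for this illustration, and the
sketch makes no claim about what the compiler actually generates for the
driver:

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for a 32-bit byte swap. */
static uint32_t swab32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Swapping twice yields the original value, so a swap done when the
 * value is stored followed by a swap done by the write accessor is
 * wasted work. */
static void check_roundtrip(uint32_t v)
{
	assert(swab32(swab32(v)) == v);
}
```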
^ permalink raw reply
* Re: [WTF?] random test in netlink_sendmsg()
From: David Miller @ 2014-12-13 2:33 UTC (permalink / raw)
To: viro; +Cc: kaber, netdev
In-Reply-To: <20141213015415.GG22149@ZenIV.linux.org.uk>
From: Al Viro <viro@ZenIV.linux.org.uk>
Date: Sat, 13 Dec 2014 01:54:15 +0000
> On Fri, Dec 12, 2014 at 08:07:58PM -0500, David Miller wrote:
>> From: Al Viro <viro@ZenIV.linux.org.uk>
>> Date: Fri, 12 Dec 2014 21:32:43 +0000
>>
>> > What do we want sendmsg(fd, &msg, 0) to do when fd is AF_NETLINK socket
>> > that had setsockopt(fd, SOL_NETLINK, NETLINK_TX_RING, ...) successfully done
>> > to it and msg.msg_iovlen is 0?
>>
>> We had a similar issue with msg_name/msg_namelen and we ended up saying
>> that if msg_namelen is zero then we force msg_name to NULL.
>
> Hmm... The thing is, there might be legitimate users with empty payload,
> making this call for the sake of SCM_CREDENTIALS. IOW, what should happen
> if we have
> msg_iovlen = 0
> msg_iov = <anything>
> msg_control = &cmsg
> msg_controllen = cmsg_len
> Sure, both paths will pass creds, but what about the payload? And the number
> of datagram actually transmitted, for that matter?
Ok, so we just adjust the AF_PACKET check to test msg_iovlen==1 as
well, and that takes care of that case.
Right?
^ permalink raw reply
* Re: [WTF?] random test in netlink_sendmsg()
From: Al Viro @ 2014-12-13 1:54 UTC (permalink / raw)
To: David Miller; +Cc: kaber, netdev
In-Reply-To: <20141212.200758.944592759380344519.davem@davemloft.net>
On Fri, Dec 12, 2014 at 08:07:58PM -0500, David Miller wrote:
> From: Al Viro <viro@ZenIV.linux.org.uk>
> Date: Fri, 12 Dec 2014 21:32:43 +0000
>
> > What do we want sendmsg(fd, &msg, 0) to do when fd is AF_NETLINK socket
> > that had setsockopt(fd, SOL_NETLINK, NETLINK_TX_RING, ...) successfully done
> > to it and msg.msg_iovlen is 0?
>
> We had a similar issue with msg_name/msg_namelen and we ended up saying
> that if msg_namelen is zero then we force msg_name to NULL.
Hmm... The thing is, there might be legitimate users with empty payload,
making this call for the sake of SCM_CREDENTIALS. IOW, what should happen
if we have
msg_iovlen = 0
msg_iov = <anything>
msg_control = &cmsg
msg_controllen = cmsg_len
Sure, both paths will pass creds, but what about the payload? And the number
of datagram actually transmitted, for that matter?
^ permalink raw reply
* Re: [bisected] tg3 broken in 3.18.0?
From: David Miller @ 2014-12-13 1:18 UTC (permalink / raw)
To: nholland; +Cc: netdev
In-Reply-To: <20141213011408.GA27568@teela.fritz.box>
From: Nils Holland <nholland@tisys.org>
Date: Sat, 13 Dec 2014 02:14:08 +0100
> Ok folks,
>
> I now took the time to bisect the issue that killed the tg3 network
> interface on one of my boxes in 3.18.0. Beside me, at least one other
> person was affected, although we also have a confirmed report of
> another person using tg3 without issues under 3.18.0.
>
> My bisect exercise suggests that the following commit is the culprit:
>
> 89665a6a71408796565bfd29cfa6a7877b17a667 (PCI: Check only the Vendor
> ID to identify Configuration Request Retry)
>
> In case that rings a bell for anyone, I'd be more than glad to hear
> about it! Otherwise, while I'm no expert at this, I'll do some more
> investigations tomorrow. It's gotten kind of late during bisecting and
> I'm off for some sleep now. ;-)
You definitely need to bring this up with the author of that change
and the relevant list for the PCI subsystem and/or linux-kernel.
^ permalink raw reply
* [bisected] tg3 broken in 3.18.0?
From: Nils Holland @ 2014-12-13 1:14 UTC (permalink / raw)
To: netdev
In-Reply-To: <20141212203134.GA23705@teela.fritz.box>
Ok folks,
I now took the time to bisect the issue that killed the tg3 network
interface on one of my boxes in 3.18.0. Beside me, at least one other
person was affected, although we also have a confirmed report of
another person using tg3 without issues under 3.18.0.
My bisect exercise suggests that the following commit is the culprit:
89665a6a71408796565bfd29cfa6a7877b17a667 (PCI: Check only the Vendor
ID to identify Configuration Request Retry)
In case that rings a bell for anyone, I'd be more than glad to hear
about it! Otherwise, while I'm no expert at this, I'll do some more
investigations tomorrow. It's gotten kind of late during bisecting and
I'm off for some sleep now. ;-)
Greetings,
Nils
^ permalink raw reply
* Re: [WTF?] random test in netlink_sendmsg()
From: David Miller @ 2014-12-13 1:07 UTC (permalink / raw)
To: viro; +Cc: kaber, netdev
In-Reply-To: <20141212213242.GE22149@ZenIV.linux.org.uk>
From: Al Viro <viro@ZenIV.linux.org.uk>
Date: Fri, 12 Dec 2014 21:32:43 +0000
> What do we want sendmsg(fd, &msg, 0) to do when fd is AF_NETLINK socket
> that had setsockopt(fd, SOL_NETLINK, NETLINK_TX_RING, ...) successfully done
> to it and msg.msg_iovlen is 0?
We had a similar issue with msg_name/msg_namelen and we ended up saying
that if msg_namelen is zero then we force msg_name to NULL.
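
The msg_name rule described above can be sketched like this, on a userspace
struct msghdr; normalize_msg_name() is a hypothetical name for illustration
and not the kernel function that implements the rule:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper: a zero msg_namelen means "no address", so any
 * pointer left in msg_name is discarded instead of being trusted. */
static void normalize_msg_name(struct msghdr *msg)
{
	assert(msg != NULL);
	if (msg->msg_namelen == 0)
		msg->msg_name = NULL;
}
```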
^ permalink raw reply