Netdev List


* Re: Network namespace bugs in L2TP
From: Eric W. Biederman @ 2012-12-12 19:44 UTC
  To: Tom Parkin; +Cc: netdev
In-Reply-To: <20121212155105.GB2790@raven>

Tom Parkin <tparkin@katalix.com> writes:

> Hi Eric,
>
> I'm following up on this thread from later October in which you
> pointed out some network namespace bugs in L2TP:
>
> http://www.spinics.net/lists/netdev/msg214776.html
>
> I use L2TP, and I'd like to help fix these bugs.  But I'm not very
> conversant with network namespaces, and so I'm struggling to fully
> appreciate the issues you pointed out previously.  Could you give me a
> hand getting to grips with this?
>
> So far I've tested L2TP within network namespaces, using both iproute2
> to create sessions between two namespaces on the same host, and an
> L2TP daemon running in a namespace to create sessions between two
> hosts.  In both cases I've done a bit of trivial ping and iperf
> testing using Ethernet pseudowires.
>
> To make this work I've had to add a couple of trivial patches (see
> below).
>
> There are two things I'm uncertain about:
>
>  1. Why do we need to change the namespace of the socket created in
>     l2tp_tunnel_sock_create?  So far as I can tell, sock_create
>     defaults to the namespace of the calling process.  Is the issue
>     here that this code may run from a work queue or similar?

Something similar.  At the very least, l2tp_tunnel_create, which calls
l2tp_tunnel_sock_create, gets called from netlink.  The network namespace
of a socket is not necessarily the same as the network namespace of the
process that uses that socket.

So since current is not necessarily in the right network namespace, we need
to push the desired network namespace of the socket down into
l2tp_tunnel_sock_create and use that when creating the socket.

>  2. You mentioned the need to keep track of sockets allocated within a
>     namespace in order to be able to clean them up when the namespace
>     is deleted.  Should we be keeping a list of sockets we create and
>     then destroying them in the namespace pernet_ops exit function?

I think the issue that I was referring to, and certainly the issue I am
thinking about, is that normal sockets hold a reference to a
network namespace and keep the network namespace alive.  Today l2tp uses
sock_create when creating a socket, and as such I think it pins its
current network namespace.  So I believe we can effectively have a
reference counting loop, with l2tp sockets pinning the network namespace
and the network namespace keeping the l2tp device alive, which keeps the
l2tp socket alive.
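
To make the suspected loop concrete, here is a toy userspace model in C.
It is not kernel code: struct net_model, net_put() and
namespace_torn_down() are names invented for this illustration, and the
real reference counting in the kernel is more involved.  The only point
is that a socket holding its own reference keeps the count above zero,
so the teardown that would close the socket never runs.

```c
/* Toy model: a namespace is torn down when its reference count hits 0. */
struct net_model {
	int refs;             /* references held on the namespace */
	int tunnel_sock_open; /* 1 while the tunnel socket still exists */
};

/* Drop one reference; the last put runs the (modelled) exit hook. */
static void net_put(struct net_model *net)
{
	if (--net->refs == 0)
		net->tunnel_sock_open = 0; /* exit hook closes the tunnel socket */
}

/*
 * Simulate "last process leaves the namespace".
 * Returns 1 if the namespace was torn down, 0 if it stayed pinned.
 */
static int namespace_torn_down(int socket_pins_namespace)
{
	struct net_model net = { .refs = 1, .tunnel_sock_open = 1 };

	if (socket_pins_namespace)
		net.refs++;	/* the socket takes its own reference */

	net_put(&net);		/* the last process drops its reference */
	return !net.tunnel_sock_open;
}
```

In this model namespace_torn_down(1) reports that the namespace stays
pinned, while namespace_torn_down(0) reports a clean teardown.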

I don't remember the specifics of l2tp, as it creates some sockets and
has other sockets passed in, and as such has rules that are not at all
normal.

Eric

* Re: [patch net-next 0/4] net: allow to change carrier from userspace
From: Stephen Hemminger @ 2012-12-12 19:34 UTC
  To: Jiri Pirko; +Cc: netdev, davem, edumazet, bhutchings, mirqus, greearb, fbl
In-Reply-To: <20121212190613.GE3060@minipsycho.orion>

On Wed, 12 Dec 2012 20:06:13 +0100
Jiri Pirko <jiri@resnulli.us> wrote:

> Wed, Dec 12, 2012 at 07:54:48PM CET, shemminger@vyatta.com wrote:
> >On Wed, 12 Dec 2012 19:49:26 +0100
> >Jiri Pirko <jiri@resnulli.us> wrote:
> >
> >> Wed, Dec 12, 2012 at 07:36:32PM CET, shemminger@vyatta.com wrote:
> >> >On Wed, 12 Dec 2012 19:25:56 +0100
> >> >Jiri Pirko <jiri@resnulli.us> wrote:
> >> >
> >> >> Wed, Dec 12, 2012 at 07:12:08PM CET, shemminger@vyatta.com wrote:
> >> >> >On Wed, 12 Dec 2012 19:10:17 +0100
> >> >> >Jiri Pirko <jiri@resnulli.us> wrote:
> >> >> >
> >> >> >> ># ip li show dev dummy0
> >> >> >> >12: dummy0: <NO-CARRIER,BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state DORMANT mode DORMANT   
> >> >> >> 
> >> >> >> if you mean this "NO-CARRIER"
> >> >> >> it has no direct relation with netif_carrier_ok().
> >> >> >
> >> >> >It is the same value (IFF_RUNNING) that is visible from user space.
> >> >> 
> >> >> static inline bool netif_carrier_ok(const struct net_device *dev)
> >> >> {
> >> >> 	        return !test_bit(__LINK_STATE_NOCARRIER, &dev->state);
> >> >> }
> >> >> 
> >> >> So netif_carrier[ok/on/off] are working with on __LINK_STATE_NOCARRIER
> >> >> bit. Not with IFF_RUNNING flag.
> >> >
> >> >What is the code path that you are worried about netif_carrier_ok being set or clear?
> >> >The interaction here is complex, and right now LINK_STATE_NOCARRIER is purely
> >> >controlled by the driver, your patch changes that, but before acking I want
> >> >to make sure why it is required.
> >> 
> >> This patchset would provide a possibility to set or clear the carrier
> >> from userspace. For dummy device it would serve for direct emulation
> >> of link fail.
> >> 
> >> Also for team deriver, that would serve for teamd (userspace part) to
> >> set the carrier actually on or off (in case of LACP runner for example
> >> this is required).
> >> 
> >
> >You want to able to control the dummy device, so that you can test carrier
> >management in the team device. Another alternative is to use carrier control
> >on a virtual device. Vmware can do it, there were patches to do this with KVM/QEMU
> >not sure if they ever got incorporated.
> >
> >Since this is a specific feature of the dummy device which is specialized for
> >testing, maybe it should just be done by adding device specific ioctl rather
> >than letting it creep in as a general facility.
> 
> Ugh, specific ioctl stinks...
> But this is not only for dummy. As I said, we need this for team driver.
> Maybe I did not explain that correctly. Given the fact that the whole
> Team logic is in userspace, teamd (userspace daemon) needs to set the
> carrier state as if it was done in kernel. Yes, we would be able to do
> this by specific Team option in team driver, but I thought this would be
> nicer to do that more generally.

That is what the operstate mechanism was for. Why did we build that mechanism
if it doesn't work from userspace?

Maybe the fix is to make setting linkstate also set carrier bits.

* Re: [RFC PATCH net-next 0/5] Ease netns management for userland
From: Eric W. Biederman @ 2012-12-12 19:25 UTC
  To: Nicolas Dichtel; +Cc: netdev, davem, aatteka
In-Reply-To: <1355332630-4256-1-git-send-email-nicolas.dichtel@6wind.com>

Nicolas Dichtel <nicolas.dichtel@6wind.com> writes:

> The goal of this serie is to ease netns management by daemons. Some systems use
> netns only to virtualize network stack and don't want to multiply userland
> daemons.  These system may have a lot of netns, up to 2000. We don't want to
> launch an instance of each daemons (quagga, strongswan, conntrackd, ...) for
> each netns because it will consume a lot of ressources. Having one daemon that
> manage all netns is more efficient (mainly if there are few objects to manage:
> one or two routes per netns for example).
> Hence, one goal of this serie is to allow, for a daemon, to monitor netns
> activities, thus it can open or close netlink sockets, allocating structures
> needed to manage these netns when they are created or deleted.
> To help to identify a netns, an index has been added to each netns.
>
> A new setsockopt() option is also added, to help daemons to open socket in the
> right netns. For now, a daemon that want to open a socket in a specified netns,
> need to call setns(CLONE_NEWNET) with a fd (not so easy to found), open the
> socket and then call again setns() to go back in the initial netns. Having this
> kind of setsockopt() will simplify operations. Obviously, this setsockopt()
> should be done enough early (is test on sk_state enough?). The first target is
> netlink socket but it can be useful for other kind of socket, it's why a add a
> generic socket option.
>
> As usual, the patch against iproute2 will be sent once the patches are included
> and net-next merged. I can send it on demand.

Short answer: you don't need to do any of this.

setns with the namespace files in /proc/<pid>/ns/net gives you more than
enough mechanism to solve this problem.  And iproute2 already supports
all of this.

And your approach creates very serious maintenance problems, to the
point that I don't even want to read your patches.  What namespace do your
namespace ids live in?

A socket option to change the namespace of a socket is nasty because sockets
changing which network namespace they are in leads to races which
aren't worth thinking about, let alone writing the code to handle.

Longer answer:

You can bind mount the /proc/<pid>/ns/net namespace files to
give them any name you want.  This puts naming policy under userspace
control, and nests just fine.

You can open a socket in any network namespace you want just
by calling setns before socket.  Wrapping this idiom in a library call or,
if there is sufficient need, in a socketat system call seems
reasonable.

There is a classic question of whether two network namespace files refer to
the same network namespace.  I have code in linux-next, and in my pull
request to Linus, to give those files a unique inode number.
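
Assuming a kernel that carries that change, the comparison can be
sketched in userspace C as below.  same_netns() is a hypothetical helper
name.  It simply compares the device and inode numbers that stat()
reports for the two namespace files.

```c
#include <sys/stat.h>

/*
 * Hypothetical helper: returns 1 if both paths refer to the same
 * namespace file (same device and inode), 0 if they differ, and -1 if
 * either path cannot be examined.
 */
int same_netns(const char *path_a, const char *path_b)
{
	struct stat a, b;

	if (stat(path_a, &a) < 0 || stat(path_b, &b) < 0)
		return -1;

	return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}
```

For example, same_netns("/proc/self/ns/net", "/proc/1/ns/net") would
tell a daemon whether it shares a network namespace with init, provided
it is allowed to read both files.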

So please use the facilities already merged into the kernel.

Thank you,
Eric

* [net-next:master 14/17] net/bridge/br_mdb.c:330 br_mdb_add_group() error: potential null dereference 'mp'. (br_multicast_new_group returns null)
From: kbuild test robot @ 2012-12-12 19:21 UTC
  To: Cong Wang; +Cc: netdev

tree:   git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master
head:   520dfe3a3645257bf83660f672c47f8558f3d4c4
commit: cfd567543590f71ca0af397437e2554f9756d750 [14/17] bridge: add support of adding and deleting mdb entries


smatch warnings:

+ net/bridge/br_mdb.c:330 br_mdb_add_group() error: potential null dereference 'mp'.  (br_multicast_new_group returns null)

vim +330 +/mp net/bridge/br_mdb.c

cfd56754 Cong Wang 2012-12-11  314  {
cfd56754 Cong Wang 2012-12-11  315  	struct net_bridge_mdb_entry *mp;
cfd56754 Cong Wang 2012-12-11  316  	struct net_bridge_port_group *p;
cfd56754 Cong Wang 2012-12-11  317  	struct net_bridge_port_group __rcu **pp;
cfd56754 Cong Wang 2012-12-11  318  	struct net_bridge_mdb_htable *mdb;
cfd56754 Cong Wang 2012-12-11  319  	int err;
cfd56754 Cong Wang 2012-12-11  320  
cfd56754 Cong Wang 2012-12-11  321  	mdb = mlock_dereference(br->mdb, br);
cfd56754 Cong Wang 2012-12-11  322  	mp = br_mdb_ip_get(mdb, group);
cfd56754 Cong Wang 2012-12-11  323  	if (!mp) {
cfd56754 Cong Wang 2012-12-11  324  		mp = br_multicast_new_group(br, port, group);
cfd56754 Cong Wang 2012-12-11  325  		err = PTR_ERR(mp);
cfd56754 Cong Wang 2012-12-11  326  		if (IS_ERR(mp))
cfd56754 Cong Wang 2012-12-11  327  			return err;
cfd56754 Cong Wang 2012-12-11  328  	}
cfd56754 Cong Wang 2012-12-11  329  
cfd56754 Cong Wang 2012-12-11 @330  	for (pp = &mp->ports;
cfd56754 Cong Wang 2012-12-11  331  	     (p = mlock_dereference(*pp, br)) != NULL;
cfd56754 Cong Wang 2012-12-11  332  	     pp = &p->next) {
cfd56754 Cong Wang 2012-12-11  333  		if (p->port == port)
cfd56754 Cong Wang 2012-12-11  334  			return -EEXIST;
cfd56754 Cong Wang 2012-12-11  335  		if ((unsigned long)p->port < (unsigned long)port)
cfd56754 Cong Wang 2012-12-11  336  			break;
cfd56754 Cong Wang 2012-12-11  337  	}
cfd56754 Cong Wang 2012-12-11  338  
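
The warning comes down to the fact that a NULL pointer is not an error
pointer, so an IS_ERR() check alone lets NULL through to the dereference
at line 330.  The userspace model below illustrates this.  err_ptr(),
ptr_err(), is_err() and group_status() are stand-ins written for this
example, not the kernel's own helpers, and mapping NULL to -ENOMEM is an
assumption made here, not necessarily the fix that was applied.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer, as error pointers do. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True only for pointers in the top MODEL_MAX_ERRNO bytes; NULL is not one. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-MODEL_MAX_ERRNO);
}

/*
 * Classify a lookup result before dereferencing it.
 * Assumption for this sketch: a NULL result means allocation failure.
 */
static int group_status(const void *mp)
{
	if (!mp)
		return -ENOMEM;
	if (is_err(mp))
		return (int)ptr_err(mp);
	return 0;	/* safe to dereference */
}
```

With a check of this shape, both a NULL return and an encoded error are
rejected before the loop that starts at line 330 touches mp->ports.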

---
0-DAY kernel build testing backend         Open Source Technology Center
Fengguang Wu, Yuanhan Liu                              Intel Corporation

* Re: [RFC] net : add tx timestamp to packet mmap.
From: David Miller @ 2012-12-12 19:23 UTC
  To: Paul.Chavent; +Cc: edumazet, daniel.borkmann, xemul, ebiederm, netdev
In-Reply-To: <1355326165-12277-1-git-send-email-paul.chavent@onera.fr>

You're changing the code that handles sendmsg() and then wondering why
a recvmsg() call doesn't provide a timestamp.

* Re: [PATCH net-next 4/7] openvswitch: add ipv6 'set' action
From: Jesse Gross @ 2012-12-12 19:17 UTC
  To: Tom Herbert
  Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA,
	David Miller, Mike Waychison
In-Reply-To: <CA+mtBx-84PQoHmauNpN4vYLWXcJdESMMep849DQcUAjkmC7PXQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Wed, Dec 12, 2012 at 10:38 AM, Tom Herbert <therbert-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
>> At an implementation level, the goal is definitely to share as much
>> code as possible.  Some of that was obviously done to support this
>> patch and I'm sure there are more areas where it could be taken
>> further.
>>
>> At a more conceptual level we've explored this path a number of times
>> and it's never been attractive since it has a tendency to drag more
>> OVS code into other parts of the kernel and generally make things
>> worse for everybody.  Of course, it's hard to say without knowing what
>> you're thinking.  Do you have a specific proposal?
>
> Where is the line drawn?  Is the intent that over the next five years
> that functionality will be added ad hoc increments to make OVS have
> the same functionality as IP tables, tc, routing?  Are we going to
> have things like NAT, stateful firewalls, DDOS mechanisms implemented
> in OVS (we already have people proposing such things!).

Definitely no to all of the above. (As an aside, years ago there was
NAT functionality in a precursor to OVS.  Everybody hated it and was
very happy when it was removed, so I wouldn't worry about that type of
thing popping up in OVS any time soon.)

The design of OVS works pretty well for the types of stateless
operations that are currently implemented because those map nicely to
flows that userspace can use to program in a fairly clean and powerful
manner.  This is much less true for things like stateful rules, QoS,
DPI, etc. because you either want to look at more information than
would usually be considered a flow or have state that changes very
quickly.  In these cases, the data plane needs to take action on its
own and the interaction with userspace is more akin to configuration
than programming.

As these types of features come up, I think you will start to see more
integration with netfilter and other tools (in fact, there are several
examples of this already - OVS QoS uses tc, the ability to interact
with skb->mark was added recently, and Pravin has been doing a lot of
work to refactor and integrate with the upstream tunneling code).
There are some definite tradeoffs to doing it this way, mostly in the
area of state management, so I don't think that it's feasible to
switch wholesale over to this model.  However, if we're careful then I
think it's possible to get the best of both worlds.

* Re: [patch net-next 0/4] net: allow to change carrier from userspace
From: Jiri Pirko @ 2012-12-12 19:06 UTC
  To: Stephen Hemminger
  Cc: netdev, davem, edumazet, bhutchings, mirqus, greearb, fbl
In-Reply-To: <20121212105448.490aca5c@nehalam.linuxnetplumber.net>

Wed, Dec 12, 2012 at 07:54:48PM CET, shemminger@vyatta.com wrote:
>On Wed, 12 Dec 2012 19:49:26 +0100
>Jiri Pirko <jiri@resnulli.us> wrote:
>
>> Wed, Dec 12, 2012 at 07:36:32PM CET, shemminger@vyatta.com wrote:
>> >On Wed, 12 Dec 2012 19:25:56 +0100
>> >Jiri Pirko <jiri@resnulli.us> wrote:
>> >
>> >> Wed, Dec 12, 2012 at 07:12:08PM CET, shemminger@vyatta.com wrote:
>> >> >On Wed, 12 Dec 2012 19:10:17 +0100
>> >> >Jiri Pirko <jiri@resnulli.us> wrote:
>> >> >
>> >> >> ># ip li show dev dummy0
>> >> >> >12: dummy0: <NO-CARRIER,BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state DORMANT mode DORMANT   
>> >> >> 
>> >> >> if you mean this "NO-CARRIER"
>> >> >> it has no direct relation with netif_carrier_ok().
>> >> >
>> >> >It is the same value (IFF_RUNNING) that is visible from user space.
>> >> 
>> >> static inline bool netif_carrier_ok(const struct net_device *dev)
>> >> {
>> >> 	        return !test_bit(__LINK_STATE_NOCARRIER, &dev->state);
>> >> }
>> >> 
>> >> So netif_carrier[ok/on/off] are working with on __LINK_STATE_NOCARRIER
>> >> bit. Not with IFF_RUNNING flag.
>> >
>> >What is the code path that you are worried about netif_carrier_ok being set or clear?
>> >The interaction here is complex, and right now LINK_STATE_NOCARRIER is purely
>> >controlled by the driver, your patch changes that, but before acking I want
>> >to make sure why it is required.
>> 
>> This patchset would provide a possibility to set or clear the carrier
>> from userspace. For dummy device it would serve for direct emulation
>> of link fail.
>> 
>> Also for team deriver, that would serve for teamd (userspace part) to
>> set the carrier actually on or off (in case of LACP runner for example
>> this is required).
>> 
>
>You want to able to control the dummy device, so that you can test carrier
>management in the team device. Another alternative is to use carrier control
>on a virtual device. Vmware can do it, there were patches to do this with KVM/QEMU
>not sure if they ever got incorporated.
>
>Since this is a specific feature of the dummy device which is specialized for
>testing, maybe it should just be done by adding device specific ioctl rather
>than letting it creep in as a general facility.

Ugh, a device-specific ioctl stinks...
But this is not only for dummy. As I said, we need this for the team driver.
Maybe I did not explain that correctly. Given that the whole
Team logic is in userspace, teamd (the userspace daemon) needs to set the
carrier state as if it were done in the kernel. Yes, we could do
this with a specific Team option in the team driver, but I thought it would be
nicer to do it more generally.

Also, in previous discussion Michał Mirosław wrote he would like this
feature also for GRE tunnel devices.

* Re: [patch net-next 0/4] net: allow to change carrier from userspace
From: Stephen Hemminger @ 2012-12-12 18:54 UTC
  To: Jiri Pirko; +Cc: netdev, davem, edumazet, bhutchings, mirqus, greearb, fbl
In-Reply-To: <20121212184925.GD3060@minipsycho.orion>

On Wed, 12 Dec 2012 19:49:26 +0100
Jiri Pirko <jiri@resnulli.us> wrote:

> Wed, Dec 12, 2012 at 07:36:32PM CET, shemminger@vyatta.com wrote:
> >On Wed, 12 Dec 2012 19:25:56 +0100
> >Jiri Pirko <jiri@resnulli.us> wrote:
> >
> >> Wed, Dec 12, 2012 at 07:12:08PM CET, shemminger@vyatta.com wrote:
> >> >On Wed, 12 Dec 2012 19:10:17 +0100
> >> >Jiri Pirko <jiri@resnulli.us> wrote:
> >> >
> >> >> ># ip li show dev dummy0
> >> >> >12: dummy0: <NO-CARRIER,BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state DORMANT mode DORMANT   
> >> >> 
> >> >> if you mean this "NO-CARRIER"
> >> >> it has no direct relation with netif_carrier_ok().
> >> >
> >> >It is the same value (IFF_RUNNING) that is visible from user space.
> >> 
> >> static inline bool netif_carrier_ok(const struct net_device *dev)
> >> {
> >> 	        return !test_bit(__LINK_STATE_NOCARRIER, &dev->state);
> >> }
> >> 
> >> So netif_carrier[ok/on/off] are working with on __LINK_STATE_NOCARRIER
> >> bit. Not with IFF_RUNNING flag.
> >
> >What is the code path that you are worried about netif_carrier_ok being set or clear?
> >The interaction here is complex, and right now LINK_STATE_NOCARRIER is purely
> >controlled by the driver, your patch changes that, but before acking I want
> >to make sure why it is required.
> 
> This patchset would provide a possibility to set or clear the carrier
> from userspace. For dummy device it would serve for direct emulation
> of link fail.
> 
> Also for team deriver, that would serve for teamd (userspace part) to
> set the carrier actually on or off (in case of LACP runner for example
> this is required).
> 

You want to be able to control the dummy device, so that you can test carrier
management in the team device. Another alternative is to use carrier control
on a virtual device. VMware can do it, and there were patches to do this with
KVM/QEMU; I'm not sure if they ever got incorporated.

Since this is a specific feature of the dummy device, which is specialized for
testing, maybe it should just be done by adding a device-specific ioctl rather
than letting it creep in as a general facility.

* Re: [RFC PATCH v2 3/3] tun: fix LSM/SELinux labeling of tun/tap devices
From: Paul Moore @ 2012-12-12 18:49 UTC
  To: Michael S. Tsirkin; +Cc: netdev, linux-security-module, selinux, jasowang
In-Reply-To: <20121212092236.GB4354@redhat.com>

On Wednesday, December 12, 2012 11:22:36 AM Michael S. Tsirkin wrote:
> On Wed, Dec 05, 2012 at 03:26:19PM -0500, Paul Moore wrote:
> > This patch corrects some problems with LSM/SELinux that were introduced
> > with the multiqueue patchset.  The problem stems from the fact that the
> > multiqueue work changed the relationship between the tun device and its
> > associated socket; before the socket persisted for the life of the
> > device, however after the multiqueue changes the socket only persisted
> > for the life of the userspace connection (fd open).  For non-persistent
> > devices this is not an issue, but for persistent devices this can cause
> > the tun device to lose its SELinux label.
> > 
> > We correct this problem by adding an opaque LSM security blob to the
> > tun device struct which allows us to have the LSM security state, e.g.
> > SELinux labeling information, persist for the lifetime of the tun
> > device.  In the process we tweak the LSM hooks to work with this new
> > approach to TUN device/socket labeling and introduce a new LSM hook,
> > security_tun_dev_create_queue(), to approve requests to create a new
> > TUN queue via TUNSETQUEUE.
> > 
> > The SELinux code has been adjusted to match the new LSM hooks, the
> > other LSMs do not make use of the LSM TUN controls.  This patch makes
> > use of the recently added "tun_socket:create_queue" permission to
> > restrict access to the TUNSETQUEUE operation.  On older SELinux
> > policies which do not define the "tun_socket:create_queue" permission
> > the access control decision for TUNSETQUEUE will be handled according
> > to the SELinux policy's unknown permission setting.

...

> > @@ -465,6 +466,10 @@ static int tun_attach(struct tun_struct *tun, struct file *file)
> >  	struct tun_file *tfile = file->private_data;
> >  	int err;
> >  
> > +	err = security_tun_dev_attach(tfile->socket.sk, tun->security);
> > +	if (err < 0)
> > +		goto out;
> > +
> >  	err = -EINVAL;
> >  	if (rcu_dereference_protected(tfile->tun, lockdep_rtnl_is_held()))
> >  		goto out;
> 
> This hook triggers with both set_queue and set_iff,
> and it also seems to trigger when attaching to a
> persistent device and when creating a new one. But I
> believe we might want to be able to allow one but not the other.
> 
> For example:
> 	- we might want to allow qemu to do set_queue but not set_iff
> 	- we might want to configure presistent devices and
> 	  prevent a user from adding new ones

Please look at the rest of the patch and see what the hook actually does.  It
does not perform any access control under SELinux; all it does is ensure that
the socket is labeled based on the associated TUN device.

> > - * @tun_dev_post_create:
> > - *	This hook allows a module to update or allocate a per-socket security
> > - *	structure.
> > - *	@sk contains the newly created sock structure.
> 
> I worry that removing a hook hurt users that use it in their
> security policy.

We need to change the hooks because there was a significant change to the 
implementation of a TUN device.

However, even when changing the LSM hooks, we have preserved the SELinux 
access controls for standard, e.g. single queue, TUN devices such that 
existing SELinux policies will work for existing TUN users.  The new SELinux 
access control we added only comes into play when TUN users want to enable 
multiple queues.

-- 
paul moore
security and virtualization @ redhat


* Re: [patch net-next 0/4] net: allow to change carrier from userspace
From: Jiri Pirko @ 2012-12-12 18:49 UTC
  To: Stephen Hemminger
  Cc: netdev, davem, edumazet, bhutchings, mirqus, greearb, fbl
In-Reply-To: <20121212103632.2020efce@nehalam.linuxnetplumber.net>

Wed, Dec 12, 2012 at 07:36:32PM CET, shemminger@vyatta.com wrote:
>On Wed, 12 Dec 2012 19:25:56 +0100
>Jiri Pirko <jiri@resnulli.us> wrote:
>
>> Wed, Dec 12, 2012 at 07:12:08PM CET, shemminger@vyatta.com wrote:
>> >On Wed, 12 Dec 2012 19:10:17 +0100
>> >Jiri Pirko <jiri@resnulli.us> wrote:
>> >
>> >> ># ip li show dev dummy0
>> >> >12: dummy0: <NO-CARRIER,BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state DORMANT mode DORMANT   
>> >> 
>> >> if you mean this "NO-CARRIER"
>> >> it has no direct relation with netif_carrier_ok().
>> >
>> >It is the same value (IFF_RUNNING) that is visible from user space.
>> 
>> static inline bool netif_carrier_ok(const struct net_device *dev)
>> {
>> 	        return !test_bit(__LINK_STATE_NOCARRIER, &dev->state);
>> }
>> 
>> So netif_carrier[ok/on/off] are working with on __LINK_STATE_NOCARRIER
>> bit. Not with IFF_RUNNING flag.
>
>What is the code path that you are worried about netif_carrier_ok being set or clear?
>The interaction here is complex, and right now LINK_STATE_NOCARRIER is purely
>controlled by the driver, your patch changes that, but before acking I want
>to make sure why it is required.

This patchset would make it possible to set or clear the carrier
from userspace. For the dummy device it would allow direct emulation
of link failure.

Also, for the team driver, it would let teamd (the userspace part)
actually set the carrier on or off (this is required for the LACP
runner, for example).

* Re: [PATCH 2/2] iproute2: add support to monitor mdb entries too
From: Stephen Hemminger @ 2012-12-12 18:41 UTC
  To: Cong Wang; +Cc: netdev, bridge, Thomas Graf
In-Reply-To: <1355300590-2390-4-git-send-email-amwang@redhat.com>

On Wed, 12 Dec 2012 16:23:10 +0800
Cong Wang <amwang@redhat.com> wrote:

> From: Cong Wang <amwang@redhat.com>
> 
> This patch implements `bridge monitor mdb`.
> 
> Cc: Stephen Hemminger <shemminger@vyatta.com>
> Cc: Thomas Graf <tgraf@suug.ch>
> Signed-off-by: Cong Wang <amwang@redhat.com>
> 

Accepted for 3.8 since Dave accepted the kernel parts. Thanks

* Re: [RFC PATCH net-next 0/5] Ease netns management for userland
From: Nicolas Dichtel @ 2012-12-12 18:39 UTC
  To: netdev; +Cc: davem, ebiederm, aatteka
In-Reply-To: <1355332630-4256-1-git-send-email-nicolas.dichtel@6wind.com>

2012/12/12 Nicolas Dichtel <nicolas.dichtel@6wind.com>:
> The goal of this serie is to ease netns management by daemons. Some systems use
> netns only to virtualize network stack and don't want to multiply userland
> daemons.  These system may have a lot of netns, up to 2000. We don't want to
> launch an instance of each daemons (quagga, strongswan, conntrackd, ...) for
> each netns because it will consume a lot of ressources. Having one daemon that
> manage all netns is more efficient (mainly if there are few objects to manage:
> one or two routes per netns for example).
> Hence, one goal of this serie is to allow, for a daemon, to monitor netns
> activities, thus it can open or close netlink sockets, allocating structures
> needed to manage these netns when they are created or deleted.
> To help to identify a netns, an index has been added to each netns.
>
> A new setsockopt() option is also added, to help daemons to open socket in the
> right netns. For now, a daemon that want to open a socket in a specified netns,
> need to call setns(CLONE_NEWNET) with a fd (not so easy to found), open the
> socket and then call again setns() to go back in the initial netns. Having this
> kind of setsockopt() will simplify operations. Obviously, this setsockopt()
> should be done enough early (is test on sk_state enough?). The first target is
> netlink socket but it can be useful for other kind of socket, it's why a add a
> generic socket option.
>
> As usual, the patch against iproute2 will be sent once the patches are included
> and net-next merged. I can send it on demand.
>
>  arch/alpha/include/asm/socket.h        |   2 +
>  arch/avr32/include/uapi/asm/socket.h   |   2 +
>  arch/frv/include/uapi/asm/socket.h     |   2 +
>  arch/h8300/include/asm/socket.h        |   2 +
>  arch/ia64/include/uapi/asm/socket.h    |   2 +
>  arch/m32r/include/asm/socket.h         |   2 +
>  arch/m68k/include/uapi/asm/socket.h    |   2 +
>  arch/mips/include/uapi/asm/socket.h    |   2 +
>  arch/mn10300/include/uapi/asm/socket.h |   2 +
>  arch/parisc/include/uapi/asm/socket.h  |   2 +
>  arch/powerpc/include/uapi/asm/socket.h |   2 +
>  arch/s390/include/uapi/asm/socket.h    |   2 +
>  arch/sparc/include/uapi/asm/socket.h   |   2 +
>  arch/xtensa/include/uapi/asm/socket.h  |   2 +
>  include/net/net_namespace.h            |   3 +
>  include/uapi/asm-generic/socket.h      |   2 +
>  include/uapi/linux/if_link.h           |   1 +
>  include/uapi/linux/netns.h             |  31 +++++
>  net/core/net_namespace.c               | 223 +++++++++++++++++++++++++++++++++
>  net/core/rtnetlink.c                   |   7 +-
>  net/core/sock.c                        |  28 +++++
>  net/netlink/genetlink.c                |   4 +
>  22 files changed, 326 insertions(+), 1 deletion(-)
>
> I do not pretend to be a netns expert, it's why I add RFC in the title ;-)
>
> Comments are welcome.

Sorry for the double send, that was a mistake on my part!

* Re: [PATCH net-next 4/7] openvswitch: add ipv6 'set' action
From: Tom Herbert @ 2012-12-12 18:38 UTC
  To: Jesse Gross
  Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA,
	David Miller, Mike Waychison
In-Reply-To: <CAEP_g=-1aWGsjR55AaD6sLLt4QzbYgUs-3hfNNONrrf8MDwSyA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

> At an implementation level, the goal is definitely to share as much
> code as possible.  Some of that was obviously done to support this
> patch and I'm sure there are more areas where it could be taken
> further.
>
> At a more conceptual level we've explored this path a number of times
> and it's never been attractive since it has a tendency to drag more
> OVS code into other parts of the kernel and generally make things
> worse for everybody.  Of course, it's hard to say without knowing what
> you're thinking.  Do you have a specific proposal?

Where is the line drawn?  Is the intent that over the next five years
functionality will be added in ad hoc increments to make OVS have
the same functionality as iptables, tc, and routing?  Are we going to
have things like NAT, stateful firewalls, and DDoS mechanisms implemented
in OVS?  (We already have people proposing such things!)

* Re: [patch net-next 0/4] net: allow to change carrier from userspace
From: Stephen Hemminger @ 2012-12-12 18:36 UTC
  To: Jiri Pirko; +Cc: netdev, davem, edumazet, bhutchings, mirqus, greearb, fbl
In-Reply-To: <20121212182556.GC3060@minipsycho.orion>

On Wed, 12 Dec 2012 19:25:56 +0100
Jiri Pirko <jiri@resnulli.us> wrote:

> Wed, Dec 12, 2012 at 07:12:08PM CET, shemminger@vyatta.com wrote:
> >On Wed, 12 Dec 2012 19:10:17 +0100
> >Jiri Pirko <jiri@resnulli.us> wrote:
> >
> >> ># ip li show dev dummy0
> >> >12: dummy0: <NO-CARRIER,BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state DORMANT mode DORMANT   
> >> 
> >> if you mean this "NO-CARRIER"
> >> it has no direct relation with netif_carrier_ok().
> >
> >It is the same value (IFF_RUNNING) that is visible from user space.
> 
> static inline bool netif_carrier_ok(const struct net_device *dev)
> {
> 	        return !test_bit(__LINK_STATE_NOCARRIER, &dev->state);
> }
> 
> So netif_carrier[ok/on/off] are working with on __LINK_STATE_NOCARRIER
> bit. Not with IFF_RUNNING flag.

Which code path are you worried about, where netif_carrier_ok is set or cleared?
The interaction here is complex, and right now LINK_STATE_NOCARRIER is purely
controlled by the driver. Your patch changes that, but before acking I want
to understand why it is required.

^ permalink raw reply

* Re: [PATCH 6/6] netfilter: nf_nat: Handle routing changes in MASQUERADE target
From: Jozsef Kadlecsik @ 2012-12-12 18:37 UTC (permalink / raw)
  To: Andrew Collins; +Cc: netfilter-devel, netdev
In-Reply-To: <CAKTPYJQn_vVg+f1Nvbe=hU7Xzw7mX6Xw7ZR4Tz2Bpd49792-rg@mail.gmail.com>

On Tue, 11 Dec 2012, Andrew Collins wrote:

> On Tue, Dec 4, 2012 at 10:31 AM, <pablo@netfilter.org> wrote:
> >
> > From: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
> >
> > When the route changes (backup default route, VPNs) which affect a
> > masqueraded target, the packets were sent out with the outdated source
> > address. The patch addresses the issue by comparing the outgoing interface
> > directly with the masqueraded interface in the nat table.
> >
> > Events are inefficient in this case, because it'd require adding route
> > events to the network core and then scanning the whole conntrack table
> > and re-checking the route for every entry.
> >
> > Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
> > Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
> 
> Jozsef, a small question about this change.  Should this same check
> not exist here:
> 
>         case IP_CT_NEW:
>                 /* Seen it before?  This can happen for loopback, retrans,
>                  * or local packets.
>                  */
>                 if (!nf_nat_initialized(ct, maniptype)) {
>                         unsigned int ret;
> 
>                         ret = nf_nat_rule_find(skb, hooknum, in, out, ct);
>                         if (ret != NF_ACCEPT)
>                                 return ret;
> -               } else
> +               } else {
>                         pr_debug("Already setup manip %s for ct %p\n",
>                                  maniptype == NF_NAT_MANIP_SRC ? "SRC" : "DST",
>                                  ct);
> +                       if (nf_nat_oif_changed(hooknum, ctinfo, nat, out)) {
> +                               nf_ct_kill_acct(ct, ctinfo, skb);
> +                               return NF_DROP;
> +                       }
> +               }
>                 break;
> 
> as well?  It's *significantly* less common than the case you fixed,
> and perhaps just letting the state time out is acceptable, but I've
> seen TCP connections get stuck with the wrong source address if we
> haven't hit ESTABLISHED at the point when the routing change occurs
> (most reproducible on high latency links).

It is a less common case, but I think you are right: the timeout can take
several minutes. But instead of repeating the code segment, handling the
case with a "goto" would be better. Are you going to submit a patch?
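
A minimal sketch of the "goto" idea, written as a self-contained toy and not
as the actual nf_nat code: the new-connection branch and the established
branch both jump to one shared label when the outgoing interface has changed,
so the drop handling exists only once. The toy_* names and the boolean inputs
are assumptions made for illustration; the real function would use
nf_nat_oif_changed() and nf_ct_kill_acct() as in the diff quoted above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the conntrack state and netfilter verdicts. */
enum toy_ctinfo { TOY_CT_NEW, TOY_CT_ESTABLISHED };
enum toy_verdict { TOY_DROP, TOY_ACCEPT };

static enum toy_verdict toy_nat_fn(enum toy_ctinfo ctinfo,
				   bool nat_initialized, bool oif_changed)
{
	switch (ctinfo) {
	case TOY_CT_NEW:
		/* First packet: the rule lookup would happen here. */
		if (!nat_initialized)
			break;
		/* Mapping already set up: same check as the default case. */
		if (oif_changed)
			goto stale_route;
		break;
	default:
		/* ESTABLISHED and related states. */
		if (oif_changed)
			goto stale_route;
		break;
	}
	return TOY_ACCEPT;

stale_route:
	/* Single place where the stale conntrack entry would be killed. */
	return TOY_DROP;
}
```

With this shape, a second caller of the check costs one "goto" line, and the
kill-and-drop sequence cannot drift apart between the two cases.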

Best regards,
Jozsef

-
E-mail  : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.mta.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics, Hungarian Academy of Sciences
          H-1525 Budapest 114, POB. 49, Hungary

^ permalink raw reply

* Re: [PATCH V1 net-next 1/3] net: ethtool: Add destination MAC address to flow steering API
From: Ben Hutchings @ 2012-12-12 18:28 UTC (permalink / raw)
  To: Amir Vadai
  Cc: David S. Miller, netdev, Or Gerlitz, Hadar Har-Zion, Yan Burman
In-Reply-To: <1355314400-14909-2-git-send-email-amirv@mellanox.com>

On Wed, 2012-12-12 at 14:13 +0200, Amir Vadai wrote:
> From: Yan Burman <yanb@mellanox.com>
> 
> Add ability to specify destination MAC address for L3/L4 flow spec
> in order to be able to specify action for different VM's under vSwitch
> configuration. This change is transparent to older userspace.
> 
> Signed-off-by: Yan Burman <yanb@mellanox.com>
> Signed-off-by: Amir Vadai <amirv@mellanox.com>
> ---
>  include/uapi/linux/ethtool.h | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
> index d3eaaaf..be8c41e 100644
> --- a/include/uapi/linux/ethtool.h
> +++ b/include/uapi/linux/ethtool.h
> @@ -500,13 +500,15 @@ union ethtool_flow_union {
>  	struct ethtool_ah_espip4_spec		esp_ip4_spec;
>  	struct ethtool_usrip4_spec		usr_ip4_spec;
>  	struct ethhdr				ether_spec;
> -	__u8					hdata[60];
> +	__u8					hdata[52];
>  };
>  
>  struct ethtool_flow_ext {
> -	__be16	vlan_etype;
> -	__be16	vlan_tci;
> -	__be32	data[2];
> +	__u8		padding[2];
> +	unsigned char	h_dest[ETH_ALEN];	/* destination eth addr	*/
> +	__be16		vlan_etype;
> +	__be16		vlan_tci;
> +	__be32		data[2];
>  };
>  
>  /**
> @@ -1027,6 +1029,7 @@ enum ethtool_sfeatures_retval_bits {
>  #define	ETHER_FLOW	0x12	/* spec only (ether_spec) */
>  /* Flag to enable additional fields in struct ethtool_rx_flow_spec */
>  #define	FLOW_EXT	0x80000000
> +#define	FLOW_MAC_EXT	0x40000000

Please can you send another patch that adds kernel-doc to struct
ethtool_flow_ext explaining which fields are dependent on which flags.

Ben.

>  /* L3-L4 network traffic flow hash options */
>  #define	RXH_L2DA	(1 << 1)
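
A rough sketch of what such a kernel-doc comment could look like, together
with a size check showing why the layout change can stay transparent to older
userspace. The toy_* structs use <stdint.h> types in place of the kernel's
__u8/__be16/__be32, and the mapping of fields to flags in the comment is an
assumption read off the patch description, not confirmed wording.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_ETH_ALEN 6	/* stand-in for ETH_ALEN */

/* Layout before the patch: 2 + 2 + 8 = 12 bytes. */
struct toy_flow_ext_old {
	uint16_t vlan_etype;
	uint16_t vlan_tci;
	uint32_t data[2];
};

/**
 * struct toy_flow_ext_new - additional RX flow fields (illustrative copy)
 * @padding: reserved
 * @h_dest: destination MAC address; assumed valid only with FLOW_MAC_EXT
 * @vlan_etype: VLAN EtherType; assumed valid only with FLOW_EXT
 * @vlan_tci: VLAN tag control information; assumed valid only with FLOW_EXT
 * @data: user-defined data; assumed valid only with FLOW_EXT
 */
struct toy_flow_ext_new {
	uint8_t  padding[2];
	uint8_t  h_dest[TOY_ETH_ALEN];
	uint16_t vlan_etype;
	uint16_t vlan_tci;
	uint32_t data[2];
};

/* The union shrinks from 60 to 52 bytes while the ext struct grows by 8. */
_Static_assert(sizeof(struct toy_flow_ext_old) == 12, "old ext size");
_Static_assert(sizeof(struct toy_flow_ext_new) == 20, "new ext size");
_Static_assert(60 + sizeof(struct toy_flow_ext_old) ==
	       52 + sizeof(struct toy_flow_ext_new), "combined size unchanged");
```

Because the 8 new bytes sit at the front of the extension, the VLAN fields and
data should land at the same absolute position as before, provided the union
and the extension are adjacent in the enclosing spec.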

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [PATCH iproute2] ip: use rtnelink to manage mroute
From: Stephen Hemminger @ 2012-12-12 18:26 UTC (permalink / raw)
  To: Nicolas Dichtel; +Cc: netdev
In-Reply-To: <1355304728-4944-1-git-send-email-nicolas.dichtel@6wind.com>

On Wed, 12 Dec 2012 10:32:08 +0100
Nicolas Dichtel <nicolas.dichtel@6wind.com> wrote:

> mroute was using /proc/net/ip_mr_[vif|cache] to display mroute entries. Hence,
> only RT_TABLE_DEFAULT was displayed and only IPv4.
> With rtnetlink, it is possible to display all tables for IPv4 and IPv6. The output
> format is kept. Also, as before the patch, statistics are displayed when the user specifies
> the '-s' argument.
> 
> The patch also adds the support of 'ip monitor mroute', which is now possible.
> 
> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
> ---

The functionality is fine, and if you clean it up I will accept it.

Patch does not apply cleanly against current iproute2 git.

Also, it causes several compiler warnings.
ipmonitor.c: In function ‘accept_msg’:
ipmonitor.c:50:20: warning: format ‘%d’ expects argument of type ‘int’, but argument 3 has type ‘long unsigned int’ [-Wformat]

ipmroute.c: In function ‘print_mroute’:
ipmroute.c:165:4: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 4 has type ‘__u64’ [-Wformat]
ipmroute.c:165:4: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 5 has type ‘__u64’ [-Wformat]
ipmroute.c:168:5: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘__u64’ [-Wformat]
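
One common way to quiet this class of warning, sketched here under
assumptions: cast 64-bit counters to unsigned long long and print them with
%llu, and print size_t values with %zu. The function names are hypothetical
and uint64_t stands in for __u64; this is not the code from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format two 64-bit counters portably; returns the snprintf() result. */
static int toy_format_stats(char *buf, size_t len,
			    uint64_t packets, uint64_t bytes)
{
	/* The cast makes the argument type match %llu on every architecture. */
	return snprintf(buf, len, "%llu packets, %llu bytes",
			(unsigned long long)packets,
			(unsigned long long)bytes);
}

/* Format a size_t (for example a sizeof result) with its own conversion. */
static int toy_format_len(char *buf, size_t len, size_t value)
{
	return snprintf(buf, len, "len %zu", value);
}
```

The cast is preferred over switching to %lu because the width of
"unsigned long" differs between 32-bit and 64-bit builds.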

^ permalink raw reply

* Re: [patch net-next 0/4] net: allow to change carrier from userspace
From: Jiri Pirko @ 2012-12-12 18:25 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: netdev, davem, edumazet, bhutchings, mirqus, greearb, fbl
In-Reply-To: <20121212101208.361ccda0@nehalam.linuxnetplumber.net>

Wed, Dec 12, 2012 at 07:12:08PM CET, shemminger@vyatta.com wrote:
>On Wed, 12 Dec 2012 19:10:17 +0100
>Jiri Pirko <jiri@resnulli.us> wrote:
>
>> ># ip li show dev dummy0
>> >12: dummy0: <NO-CARRIER,BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state DORMANT mode DORMANT   
>> 
>> if you mean this "NO-CARRIER"
>> it has no direct relation with netif_carrier_ok().
>
>It is the same value (IFF_RUNNING) that is visible from user space.

static inline bool netif_carrier_ok(const struct net_device *dev)
{
	return !test_bit(__LINK_STATE_NOCARRIER, &dev->state);
}

So netif_carrier_[ok/on/off] work on the __LINK_STATE_NOCARRIER
bit, not on the IFF_RUNNING flag.
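
A simplified userspace model of the distinction, assuming (as I understand
the kernel and iproute2 behaviour) that LOWER_UP is derived from the carrier
bit, that IFF_RUNNING is derived from the operational state, and that ip
prints "NO-CARRIER" when a device is UP but not RUNNING. All toy_* names and
values are made up for the sketch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state bits, operstates and flags. */
enum { TOY_STATE_NOCARRIER = 1u << 0 };
enum toy_operstate { TOY_OPER_DORMANT, TOY_OPER_UP };
enum {
	TOY_IFF_UP       = 1u << 0,
	TOY_IFF_RUNNING  = 1u << 1,
	TOY_IFF_LOWER_UP = 1u << 2,
};

struct toy_dev {
	unsigned long state;		/* driver-controlled carrier bit */
	enum toy_operstate operstate;	/* can be set via IFLA_OPERSTATE */
	bool admin_up;
};

/* Mirrors netif_carrier_ok(): looks only at the NOCARRIER state bit. */
static bool toy_carrier_ok(const struct toy_dev *dev)
{
	return !(dev->state & TOY_STATE_NOCARRIER);
}

/* Flags as seen from user space in this model. */
static unsigned int toy_get_flags(const struct toy_dev *dev)
{
	unsigned int flags = 0;

	if (!dev->admin_up)
		return flags;
	flags |= TOY_IFF_UP;
	if (dev->operstate == TOY_OPER_UP)
		flags |= TOY_IFF_RUNNING;
	if (toy_carrier_ok(dev))
		flags |= TOY_IFF_LOWER_UP;
	return flags;
}

/* Assumed display rule: "NO-CARRIER" means UP without RUNNING. */
static bool toy_shows_no_carrier(unsigned int flags)
{
	return (flags & TOY_IFF_UP) && !(flags & TOY_IFF_RUNNING);
}
```

In this model a device that is up, dormant, and has its carrier bit clear
reports both NO-CARRIER and LOWER_UP, which matches the dummy0 output quoted
above while toy_carrier_ok() still returns true.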

^ permalink raw reply

* Re: [PATCH net-next 4/7] openvswitch: add ipv6 'set' action
From: Jesse Gross @ 2012-12-12 18:17 UTC (permalink / raw)
  To: Tom Herbert
  Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA,
	David Miller
In-Reply-To: <CA+mtBx-Zf9FNf11H9RM12etHnJ1bPpM_Eyc4mR7E6xsb7sUP2Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Tue, Dec 11, 2012 at 7:14 PM, Tom Herbert <therbert-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
>> This patch adds ipv6 set action functionality. It allows to change
>> traffic class, flow label, hop-limit, ipv6 source and destination
>> address fields.
>>
> I have to wonder about these patches and the underlying design
> direction.  Aren't these sort of things and more already implemented
> by IPtables but in a modular and extensible fashion?  Has there been
> any thought into hooking OVS to IP tables to leverage all the existing
> functionality?

At an implementation level, the goal is definitely to share as much
code as possible.  Some of that was obviously done to support this
patch and I'm sure there are more areas where it could be taken
further.

At a more conceptual level we've explored this path a number of times
and it's never been attractive since it has a tendency to drag more
OVS code into other parts of the kernel and generally make things
worse for everybody.  Of course, it's hard to say without knowing what
you're thinking.  Do you have a specific proposal?

^ permalink raw reply

* Re: [patch net-next 0/4] net: allow to change carrier from userspace
From: Stephen Hemminger @ 2012-12-12 18:12 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: netdev, davem, edumazet, bhutchings, mirqus, greearb, fbl
In-Reply-To: <20121212181017.GB3060@minipsycho.orion>

On Wed, 12 Dec 2012 19:10:17 +0100
Jiri Pirko <jiri@resnulli.us> wrote:

> ># ip li show dev dummy0
> >12: dummy0: <NO-CARRIER,BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state DORMANT mode DORMANT   
> 
> if you mean this "NO-CARRIER"
> it has no direct relation with netif_carrier_ok().

It is the same value (IFF_RUNNING) that is visible from user space.

^ permalink raw reply

* Re: [patch net-next 0/4] net: allow to change carrier from userspace
From: Jiri Pirko @ 2012-12-12 18:10 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: netdev, davem, edumazet, bhutchings, mirqus, greearb, fbl
In-Reply-To: <20121212092700.7ef2607a@nehalam.linuxnetplumber.net>

Wed, Dec 12, 2012 at 06:27:00PM CET, shemminger@vyatta.com wrote:
>On Wed, 12 Dec 2012 18:05:20 +0100
>Jiri Pirko <jiri@resnulli.us> wrote:
>
>> Wed, Dec 12, 2012 at 05:15:00PM CET, shemminger@vyatta.com wrote:
>> >On Wed, 12 Dec 2012 11:58:03 +0100
>> >Jiri Pirko <jiri@resnulli.us> wrote:
>> >
>> >> This is basically a repost of my previous patchset:
>> >> "[patch net-next-2.6 0/2] net: allow to change carrier via sysfs" from Aug 30
>> >> 
>> >> The way net-sysfs stores values changed and this patchset reflects it.
>> >> Also, I exposed carrier via rtnetlink iface.
>> >> 
>> >> So far, only dummy driver uses carrier change ndo. In very near future
>> >> team driver will use that as well.
>> >> 
>> >> Jiri Pirko (4):
>> >>   net: add change_carrier netdev op
>> >>   net: allow to change carrier via sysfs
>> >>   rtnl: expose carrier value with possibility to set it
>> >>   dummy: implement carrier change
>> >> 
>> >>  drivers/net/dummy.c          | 10 ++++++++++
>> >>  include/linux/netdevice.h    |  7 +++++++
>> >>  include/uapi/linux/if_link.h |  1 +
>> >>  net/core/dev.c               | 19 +++++++++++++++++++
>> >>  net/core/net-sysfs.c         | 15 ++++++++++++++-
>> >>  net/core/rtnetlink.c         | 10 ++++++++++
>> >>  6 files changed, 61 insertions(+), 1 deletion(-)
>> >> 
>> >
>> >I needed to do the same thing for a project we are working on and discovered
>> >that there already is a working documented interface for doing that via
>> >operstate mode. Therefore I can't recommend that the additional complexity
>> >of a new API for this is required.
>> 
>> I might be missing something, but I'm unable to find how operstate set
>> can affect value returned by netif_carrier_ok()
>
>Here is an example using dummy device using libmnl. It is also possible
>with ip commands.
>
># modprobe dummy
># ip li show dev dummy0
>12: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT 
>    link/ether ce:90:46:83:6e:f8 brd ff:ff:ff:ff:ff:ff
># ./dummy dummy0 init
># ip li show dev dummy0
>12: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DORMANT 
>    link/ether ce:90:46:83:6e:f8 brd ff:ff:ff:ff:ff:ff
># ip li set dummy0 up
># ip li show dev dummy0
>12: dummy0: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DORMANT 
>    link/ether ce:90:46:83:6e:f8 brd ff:ff:ff:ff:ff:ff
># ./dummy dummy0 down
># ip li show dev dummy0
>12: dummy0: <NO-CARRIER,BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state DORMANT mode DORMANT 

if you mean this "NO-CARRIER"
it has no direct relation with netif_carrier_ok().


>    link/ether ce:90:46:83:6e:f8 brd ff:ff:ff:ff:ff:ff
># ./dummy dummy0 up
># ip li show dev dummy0
>12: dummy0: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DORMANT 
>    link/ether ce:90:46:83:6e:f8 brd ff:ff:ff:ff:ff:ff
>
>
>/* Sample program to control link mode and link state */
>#include <stdio.h>
>#include <stdlib.h>
>#include <unistd.h>
>#include <string.h>
>#include <time.h>
>#include <errno.h>
>#include <sys/types.h>
>#include <sys/fcntl.h>
>#include <sys/ioctl.h>
>#include <libmnl/libmnl.h>
>
>#include <linux/if.h>
>#include <linux/if_tun.h>
>#include <linux/rtnetlink.h>
>
>static void panic(const char *str)
>{
>	perror(str);
>	exit(1);
>}
>
>static void usage(const char *cmd)
>{
>	fprintf(stderr, "Usage: %s dummyX [up|down|init]\n", cmd);
>	exit(1);
>}
>
>/* Send request and parse response */
>static void mnl_talk(struct mnl_socket *nl, struct nlmsghdr *nlh)
>{
>	unsigned portid = mnl_socket_get_portid(nl);
>	uint32_t seq = time(NULL);
>	char buf[MNL_SOCKET_BUFFER_SIZE];
>
>	nlh->nlmsg_flags |= NLM_F_ACK;
>	nlh->nlmsg_seq = seq;
>
>	if (mnl_socket_sendto(nl, nlh, nlh->nlmsg_len) < 0)
>		panic("mnl_socket_sendto failed");
>
>	int ret = mnl_socket_recvfrom(nl, buf, sizeof(buf));
>	if (ret < 0)
>		panic("mnl_socket_recvfrom");
>
>	if ( mnl_cb_run(buf, ret, seq, portid, NULL, NULL) < 0)
>		panic("mnl_cb_run");
>}
>
>static void linkstate(struct mnl_socket *nl,
>		      const char *ifname, unsigned int state)
>{
>	char buf[MNL_SOCKET_BUFFER_SIZE];
>	struct nlmsghdr *nlh = mnl_nlmsg_put_header(buf);
>	nlh->nlmsg_type = RTM_NEWLINK;
>	nlh->nlmsg_flags = NLM_F_REQUEST;
>
>	struct ifinfomsg *ifi;
>	ifi = mnl_nlmsg_put_extra_header(nlh, sizeof(struct ifinfomsg));
>	ifi->ifi_family = AF_UNSPEC;
>
>	mnl_attr_put_strz(nlh, IFLA_IFNAME, ifname);
>	mnl_attr_put_u8(nlh, IFLA_OPERSTATE, state);
>
>	mnl_talk(nl, nlh);
>}
>
>/* Set device link mode */
>static void init(struct mnl_socket *nl, const char *ifname)
>{
>	char buf[MNL_SOCKET_BUFFER_SIZE];
>	struct nlmsghdr *nlh = mnl_nlmsg_put_header(buf);
>	nlh->nlmsg_type = RTM_NEWLINK;
>	nlh->nlmsg_flags = NLM_F_REQUEST;
>
>	struct ifinfomsg *ifi;
>	ifi = mnl_nlmsg_put_extra_header(nlh, sizeof(struct ifinfomsg));
>	ifi->ifi_family = AF_UNSPEC;
>	
>	mnl_attr_put_strz(nlh, IFLA_IFNAME, ifname);
>	mnl_attr_put_u8(nlh, IFLA_LINKMODE, IF_LINK_MODE_DORMANT);
>	mnl_talk(nl, nlh);
>}
>
>int main(int argc, char **argv)
>{
>	if (argc != 3)
>		usage(argv[0]);
>
>	struct mnl_socket *nl = mnl_socket_open(NETLINK_ROUTE);
>	if (!nl)
>		panic("mnl_socket_open");
>
>	if (mnl_socket_bind(nl, 0, MNL_SOCKET_AUTOPID) < 0)
>		panic("mnl_socket_bind");
>	
>
>	if (strcmp(argv[2], "init") == 0)
>		init(nl, argv[1]);
>	else if (strcmp(argv[2], "up") == 0)
>		linkstate(nl, argv[1], IF_OPER_UP);
>	else if (strcmp(argv[2], "down") == 0)
>		linkstate(nl, argv[1], IF_OPER_DORMANT);
>	else
>		usage(argv[0]);
>
>	return 0;
>}
>
>

^ permalink raw reply

* NET development closed...
From: David Miller @ 2012-12-12 18:07 UTC (permalink / raw)
  To: netdev; +Cc: linux-wireless, netfilter-devel

We're in the merge window, which means only bug fixes from now until
the merge window closes and I post a notice telling everyone
that net-next is open again.

I should be sending a pull request to Linus later today.

Thanks.

^ permalink raw reply

* Re: [PATCHv3 iproute2] add DOVE extensions for iproute2
From: Stephen Hemminger @ 2012-12-12 18:03 UTC (permalink / raw)
  To: David L Stevens; +Cc: David Miller, netdev
In-Reply-To: <201212121756.qBCHtOfn021538@lab1.dls>

On Wed, 12 Dec 2012 12:55:24 -0500
David L Stevens <dlstevens@us.ibm.com> wrote:

> 	This patch adds a new flag to iproute2 for vxlan devices to enable
> DOVE features. It also adds support for L2 and L3 switch lookup miss
> netlink messages to "ip monitor".
> 
> Changes since v2: fix merge conflict
> Changes since v1:
> 	- split "dove" flag into separate feature flags:
> 		- "proxy" for ARP reduction
> 		- "rsc" for route short circuiting
> 		- "l2miss" for L2 switch miss notifications
> 		- "l3miss" for L3 switch miss notifications
> 
> Signed-off-by: David L Stevens <dlstevens@us.ibm.com>

Applied, after mollifying the git whitespace complaints.

^ permalink raw reply

* Re: [PATCH net-next 2/2] bridge: add support of adding and deleting mdb entries
From: David Miller @ 2012-12-12 18:03 UTC (permalink / raw)
  To: amwang; +Cc: netdev, bridge, herbert, shemminger, tgraf
In-Reply-To: <1355300590-2390-2-git-send-email-amwang@redhat.com>

From: Cong Wang <amwang@redhat.com>
Date: Wed, 12 Dec 2012 16:23:08 +0800

> From: Cong Wang <amwang@redhat.com>
> 
> This patch implements adding/deleting mdb entries via netlink.
> Currently all entries are temp, we probably need a flag to distinguish
> permanent entries too.
> 
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Cc: Stephen Hemminger <shemminger@vyatta.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Thomas Graf <tgraf@suug.ch>
> Signed-off-by: Cong Wang <amwang@redhat.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next 1/2] bridge: notify mdb changes via netlink
From: David Miller @ 2012-12-12 18:03 UTC (permalink / raw)
  To: amwang; +Cc: tgraf, netdev, shemminger, bridge, herbert
In-Reply-To: <1355300590-2390-1-git-send-email-amwang@redhat.com>

From: Cong Wang <amwang@redhat.com>
Date: Wed, 12 Dec 2012 16:23:07 +0800

> From: Cong Wang <amwang@redhat.com>
> 
> As Stephen mentioned, we need to monitor the mdb
> changes in user-space, so add notifications via netlink too.
> 
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Cc: Stephen Hemminger <shemminger@vyatta.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Thomas Graf <tgraf@suug.ch>
> Signed-off-by: Cong Wang <amwang@redhat.com>

Applied.

^ permalink raw reply
