Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH] LSM: Add post recvmsg() hook.
From: Tetsuo Handa @ 2010-07-22 12:46 UTC (permalink / raw)
  To: davem
  Cc: kuznet, pekkas, jmorris, yoshfuji, kaber, paul.moore, netdev,
	linux-security-module
In-Reply-To: <20100721.220611.267376790.davem@davemloft.net>

David Miller wrote:
> > Then, why does below proposal lose information?
> 
> Peek changes state, now it's possible that two processes end up
> receiving the packet.

Indeed. We will need to protect sock->ops->recvmsg() call using a lock like

 static inline int __sock_recvmsg_nosec(struct kiocb *iocb, struct socket *sock,
 				       struct msghdr *msg, size_t size, int flags)
 {
+	int err;
 	struct sock_iocb *si = kiocb_to_siocb(iocb);
 
 	sock_update_classid(sock->sk);
 
 	si->sock = sock;
 	si->scm = NULL;
 	si->msg = msg;
 	si->size = size;
 	si->flags = flags;
 
-	return sock->ops->recvmsg(iocb, sock, msg, size, flags);
+	err = security_socket_read_lock(sock);
+	if (err)
+		return err;
+	err = sock->ops->recvmsg(iocb, sock, msg, size, flags);
+	security_socket_read_unlock(sock);
+	return err;
 }

in addition to security_socket_recvmsg_force_peek() and
security_socket_post_recvmsg().

But locks like above break MSG_DONTWAIT since recv() without MSG_DONTWAIT
calls wait_for_packet() inside __skb_recv_datagram().
To make MSG_DONTWAIT work, I have to do like below.

 struct sk_buff *__skb_recv_datagram(struct sock *sk, unsigned flags,
 				    int *peeked, int *err)
(...snipped...)
 	do {
 		/* Again only user level code calls this function, so nothing
 		 * interrupt level will suddenly eat the receive_queue.
 		 *
 		 * Look at current nfs client by the way...
 		 * However, this function was corrent in any case. 8)
 		 */
 		unsigned long cpu_flags;
 
+		/* < 0 if lock failed, 0 if no need to lock, > 0 if locked */
+		int serialized = security_socket_read_lock(sk);
+		if (serialized < 0) {
+			error = serialized;
+			goto no_packet;
+		} else if (serialized > 0) {
+			int err;
+			spin_lock_irqsave(&sk->sk_receive_queue.lock, cpu_flags);
+			skb = skb_peek(&sk->sk_receive_queue);
+			spin_unlock_irqrestore(&sk->sk_receive_queue.lock,
+					       cpu_flags);
+			if (!skb)
+				goto no_skb;
+			err = security_socket_pre_recvmsg(sk, skb);
+			if (err < 0) {
+				error = err;
+				security_socket_read_unlock(sk);
+				goto no_packet;
+			}
+		}
+
 		spin_lock_irqsave(&sk->sk_receive_queue.lock, cpu_flags);
 		skb = skb_peek(&sk->sk_receive_queue);
 		if (skb) {
 			*peeked = skb->peeked;
 			if (flags & MSG_PEEK) {
 				skb->peeked = 1;
 				atomic_inc(&skb->users);
 			} else
 				__skb_unlink(skb, &sk->sk_receive_queue);
 		}
 		spin_unlock_irqrestore(&sk->sk_receive_queue.lock, cpu_flags);
 
+no_skb:
+		if (serialized > 0)
+			security_socket_read_unlock(sk);
 		if (skb)
 			return skb;
 
 		/* User doesn't want to wait */
 		error = -EAGAIN;
 		if (!timeo)
 			goto no_packet;
 
 	} while (!wait_for_packet(sk, err, &timeo));
(...snipped...)
 }

Inserting LSM hooks like above will be the only way to work properly (i.e.
handle MSG_DONTWAIT and avoid showing the same message to multiple readers
and keep the queue's state unchanged upon error).
But you said ( http://marc.info/?l=linux-netdev&m=124022463014713&w=2 )

> We worked so hard to split out this common code, it is simply
> a non-starter for anyone to start putting protocol specific test
> into here, or even worse to move this code back to being locally
> copied into every protocol implementation.

when I proposed inserting LSM hooks into __skb_recv_datagram()
( http://marc.info/?l=linux-netdev&m=124022463014672&w=2 ).
So, I have no way to allow performing permission checks based on combination of
"process who issued recv() request" and "source address/port of the message
which the process is about to pick up" without breaking things (unless you
accept inserting LSM hooks into __skb_recv_datagram())...

^ permalink raw reply

* Re: [PATCH V4] CAN: Add Flexcan CAN controller driver
From: Wolfgang Grandegger @ 2010-07-22 12:35 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: socketcan-core-0fE9KPoRgkgATYTw5x5z8w,
	netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1279746286-19736-1-git-send-email-mkl-bIcnvbaLZ9MEGnE8C9+IrQ@public.gmane.org>

On 07/21/2010 11:04 PM, Marc Kleine-Budde wrote:
> This core is found on some Freescale SoCs and also some Coldfire
> SoCs. Support for Coldfire is missing though at the moment as
> they have an older revision of the core which does not have RX FIFO
> support.
> 
> Signed-off-by: Sascha Hauer <s.hauer-bIcnvbaLZ9MEGnE8C9+IrQ@public.gmane.org>
> Signed-off-by: Marc Kleine-Budde <mkl-bIcnvbaLZ9MEGnE8C9+IrQ@public.gmane.org>

Acked-by: Wolfgang Grandegger <wg-5Yr1BZd7O62+XT7JhA+gdA@public.gmane.org>

Thanks for your contribution.

Wolfgang.

^ permalink raw reply

* Another oops + repost
From: Martín Ferrari @ 2010-07-22 12:24 UTC (permalink / raw)
  To: netdev
  Cc: Ben Hutchings, 577640, Eric W. Biederman, Alexey Dobriyan,
	Mathieu Lacage, Daniel Lezcano

Hi again,

First of all, I would like to know if anybody was able to fix this
problem  that got kinda lost in the thread:

On Thu, Apr 22, 2010 at 16:05, Martín Ferrari <martin.ferrari@gmail.com> wrote:

> I have just downloaded and compiled 2.6.32-2 and 2.6.34-rc5 from
> kernel.org using the .config from the debian package, and the oops is
> reproducible in both.
>
> This small C file reproduces the error every time:
>
> $ cat netnsoops.c
> #include <stdio.h>
> #include <stdlib.h>
> #define _GNU_SOURCE
> #include <sched.h>
>
> int main(int argc, char *argv[])
> {
>        int c;
>        unsigned long flags = CLONE_NEWNET;
>
>        if(unshare(flags) == -1) {
>                perror("unshare");
>                return 1;
>        }
>        system("ip link add name FOO type veth peer name BAR");
>        system("ip link set FOO netns 1");
>        system("ip link show");
>        return 0;
> }

Secondly, I discovered another related bug, just more subtle.

I'm creating a dummy device, moving it into a name space and then
taking it back to netns 1. Later, when I delete the dummy, I get an
oops:

[  610.540091] BUG: unable to handle kernel NULL pointer dereference
at 0000000000000028
[  610.541512] IP: [<ffffffff81138f71>] sysfs_find_dirent+0x9/0x2f
[  610.542369] PGD 3799d067 PUD 37bb3067 PMD 0
[  610.543370] Oops: 0000 [#1] SMP
[  610.544018] last sysfs file: /sys/devices/virtual/net/lo/operstate
[  610.544018] CPU 0
[  610.544018] Modules linked in: dummy loop parport_pc parport
snd_pcm snd_timer tpm_tis tpm tpm_bios snd soundcore snd_page_alloc
pcspkr psmouse serio_raw i2c_piix4 evdev i2c_core button processor
ext3 jbd mbcache ide_cd_mod cdrom ide_gd_mod ata_generic ata_piix
8139too libata scsi_mod floppy piix 8139cp mii thermal thermal_sys
ide_core [last unloaded: scsi_wait_scan]
[  610.544018]
[  610.544018] Pid: 1359, comm: ip Tainted: G        W  2.6.34-rc5 #1 /
[  610.544018] RIP: 0010:[<ffffffff81138f71>]  [<ffffffff81138f71>]
sysfs_find_dirent+0x9/0x2f
[  610.544018] RSP: 0018:ffff88003789f988  EFLAGS: 00010296
[  610.544018] RAX: ffff88003789e000 RBX: 0000000000000000 RCX: ffff88007d3d2cd8
[  610.544018] RDX: 0000000000000003 RSI: ffffffff814a6352 RDI: 0000000000000000
[  610.544018] RBP: ffffffff814a6352 R08: 000000037f80e800 R09: ffff88007d3d2ce8
[  610.544018] R10: ffff88007d3d2800 R11: 0000000000000006 R12: ffffffff814a6352
[  610.544018] R13: ffff88007d3d2c48 R14: 0000000000000000 R15: ffff88003789fbb8
[  610.544018] FS:  00007f22855f7700(0000) GS:ffff880001a00000(0000)
knlGS:0000000000000000
[  610.544018] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  610.544018] CR2: 0000000000000028 CR3: 000000007d4df000 CR4: 00000000000006f0
[  610.544018] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  610.544018] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  610.544018] Process ip (pid: 1359, threadinfo ffff88003789e000,
task ffff88007d2b69f0)
[  610.544018] Stack:
[  610.544018]  000000037f80e800 0000000000000000 0000000000000000
ffffffff81138fbb
[  610.544018] <0> 0000000000000296 ffff88007d3d2c38 ffffffff816698d0
ffffffff8113a980
[  610.544018] <0> ffff88007d3d2c38 ffff88007d3d2c38 ffff88007d3d2800
0000000000000000
[  610.544018] Call Trace:
[  610.544018]  [<ffffffff81138fbb>] ? sysfs_get_dirent+0x24/0x43
[  610.544018]  [<ffffffff8113a980>] ? sysfs_remove_group+0x24/0xcf
[  610.544018]  [<ffffffff8120f776>] ? device_del+0x3b/0x1a0
[  610.544018]  [<ffffffff81244139>] ? rollback_registered_many+0x15d/0x1c8
[  610.544018]  [<ffffffff8124d81e>] ? rtnetlink_rcv_msg+0x0/0x1f5
[  610.544018]  [<ffffffff81244273>] ? unregister_netdevice_queue+0x78/0xa9
[  610.544018]  [<ffffffff8124c22b>] ? rtnl_dellink+0xb7/0xdb
[  610.544018]  [<ffffffff8125e887>] ? netlink_rcv_skb+0x34/0x7c
[  610.544018]  [<ffffffff8124d818>] ? rtnetlink_rcv+0x1f/0x25
[  610.544018]  [<ffffffff8125e67b>] ? netlink_unicast+0xe2/0x148
[  610.544018]  [<ffffffff8125edaf>] ? netlink_sendmsg+0x23f/0x252
[  610.544018]  [<ffffffff8123388d>] ? sock_sendmsg+0x83/0x9b
[  610.544018]  [<ffffffff810b3e0d>] ? __alloc_pages_nodemask+0x10f/0x5e2
[  610.544018]  [<ffffffff8123c76f>] ? copy_from_user+0x13/0x25
[  610.544018]  [<ffffffff8123cb25>] ? verify_iovec+0x49/0x84
[  610.544018]  [<ffffffff81233b62>] ? sys_sendmsg+0x225/0x2af
[  610.544018]  [<ffffffff81234e17>] ? sys_recvmsg+0x48/0x56
[  610.544018]  [<ffffffff81008ac2>] ? system_call_fastpath+0x16/0x1b
[  610.544018] Code: fb 74 1a 8b 07 85 c0 75 11 be 9d 00 00 00 48 c7
c7 da 56 4b 81 e8 2b cf f0 ff f0 ff 03 48 89 d8 5b c3 55 48 89 f5 53
48 83 ec 08 <48> 8b 5f 28 eb 14 48 8b 7b 18 48 89 ee e8 46 a4 04 00 85
c0 74
[  610.544018] RIP  [<ffffffff81138f71>] sysfs_find_dirent+0x9/0x2f
[  610.544018]  RSP <ffff88003789f988>
[  610.544018] CR2: 0000000000000028
[  610.597008] ---[ end trace f92104e98ea87a47 ]---


To reproduce:

$ cat netnsoops2.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
#define _GNU_SOURCE
#include <sched.h>

int main(int argc, char *argv[])
{
       int pipefd[2];
       pid_t pid;
       unsigned long flags = CLONE_NEWNET;

       if(system("ip link add name FOO type dummy"))
	       return 1;
       if(pipe(pipefd) == -1) {
               perror("pipe");
               return 1;
       }
       pid = fork();
       if(pid == -1) {
               perror("fork");
               return 1;
       }
       if(pid) {
	       char buf[256];
	       read(pipefd[0], buf, 1);
	       snprintf(buf, sizeof(buf), "ip link set FOO netns %d", pid);
	       system(buf);
       } else {
	       if(unshare(flags) == -1) {
		       perror("unshare");
		       return 1;
	       }
	       write(pipefd[1], "a", 1);
	       system("ip link set FOO netns 1");
	       return 0;
       }
       waitpid(pid, NULL, 0);
       system("ip link show");
       system("ip link del FOO");
       return 0;
}


-- 
Martín Ferrari

^ permalink raw reply

* Re: Fwd: LVS on local node
From: Simon Horman @ 2010-07-22 12:24 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Franchoze Eric, wensong, lvs-devel, netdev, netfilter-devel
In-Reply-To: <1279781811.2405.15.camel@edumazet-laptop>

On Thu, Jul 22, 2010 at 08:56:51AM +0200, Eric Dumazet wrote:

[snip]

> lvs seems not very SMP friendly and a bit complex.

I'd be interested to hear some thoughts on
how the SMP aspect of that statement could
be improved.

^ permalink raw reply

* Re: [PATCH] sysfs: Don't allow the creation of symlinks we can't remove
From: Johannes Berg @ 2010-07-22 11:30 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Greg KH, Andrew Morton, Rafael J. Wysocki, Maciej W. Rozycki,
	Kay Sievers, Greg KH, netdev
In-Reply-To: <m1sk3bycxl.fsf@fess.ebiederm.org>

On Thu, 2010-07-22 at 04:27 -0700, Eric W. Biederman wrote:

> >> Do we have a convenient command line tool to do this?
> >> I remember there being a different netlink message from
> >> normal network devices.
> >
> > iw phy0 set netns <pid>
> >
> > http://git.sipsolutions.net/iw.git
> >
> >> > root@kvm:~# ip link
> >> > 3: wlan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
> >> >     link/ether 02:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
> >> > 7: lo: <LOOPBACK> mtu 16436 qdisc noop state DOWN 
> >> >     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> >> > root@kvm:~# ls /sys/class/net/
> >> > eth0  hwsim0  lo  wlan1  wlan2
> >> 
> >> I think this is actually the output of something working.
> >> 
> >> I expect after you created a new netns you didn't mount
> >> a new instance of /sys.  /sys remembers which netns you
> >> had when you mounted it.  So you have to mount /sys again
> >> so you can see the /sys/class/net for the network namespace
> >> you are in.
> >
> > Ohh, oops! I saw all the "current->" references in the code and somehow
> > expected the same instance of sysfs to show the right thing.
> >
> > Yes, it works now. But the patch below doesn't seem to work, am I
> > missing something?
> 
> You are trying to move the phy devices as well?

Yes. The intent is that each wireless phy lives in a netns along with
all of its child devices.

> My guess is that at least part of the problem is that you don't have a
> ieee80211 directory under hwsim.

But I should have? 'ieee80211' is a class just like 'net', no?

> My apologies for not thinking about the peculiarities of the wireless
> drivers.

No worries.

johannes


^ permalink raw reply

* Re: [PATCH] sysfs: Don't allow the creation of symlinks we can't remove
From: Eric W. Biederman @ 2010-07-22 11:27 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Greg KH, Andrew Morton, Rafael J. Wysocki, Maciej W. Rozycki,
	Kay Sievers, Greg KH, netdev
In-Reply-To: <1279795286.12439.8.camel@jlt3.sipsolutions.net>

Johannes Berg <johannes@sipsolutions.net> writes:

> On Thu, 2010-07-22 at 03:35 -0700, Eric W. Biederman wrote:
>
>> >> The warning patch just makes things fail faster.  Although I get some of the
>> >> wireless interfaces for hwsim when I use this one.
>> >
>> > Hmm, I didn't.
>> 
>> To be clear I just get hwsim0.  Not wlan0 or wlan1.
>
> Ah, yes, but that's just a regular netdev, you can pretty much ignore
> it. It just shows all hwsim traffic as it is on the "air" for sniffing.
>
>> > Right, it actually starts working again with that patch you sent.
>> > However, netns support is really broken:
>> >
>> > <create net namespace, put phy0/wlan0 into it>
>> 
>> Do we have a convenient command line tool to do this?
>> I remember there being a different netlink message from
>> normal network devices.
>
> iw phy0 set netns <pid>
>
> http://git.sipsolutions.net/iw.git
>
>> > root@kvm:~# ip link
>> > 3: wlan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
>> >     link/ether 02:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
>> > 7: lo: <LOOPBACK> mtu 16436 qdisc noop state DOWN 
>> >     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>> > root@kvm:~# ls /sys/class/net/
>> > eth0  hwsim0  lo  wlan1  wlan2
>> 
>> I think this is actually the output of something working.
>> 
>> I expect after you created a new netns you didn't mount
>> a new instance of /sys.  /sys remembers which netns you
>> had when you mounted it.  So you have to mount /sys again
>> so you can see the /sys/class/net for the network namespace
>> you are in.
>
> Ohh, oops! I saw all the "current->" references in the code and somehow
> expected the same instance of sysfs to show the right thing.
>
> Yes, it works now. But the patch below doesn't seem to work, am I
> missing something?

You are trying to move the phy devices as well?

My guess is that at least part of the problem is that you don't have a
ieee80211 directory under hwsim.

My apologies for not thinking about the peculiarities of the wireless
drivers.

Eric


> ---
>  include/linux/netdevice.h |    2 ++
>  net/core/net-sysfs.c      |    3 ++-
>  net/wireless/sysfs.c      |    9 +++++++++
>  3 files changed, 13 insertions(+), 1 deletion(-)
>
> --- wireless-testing.orig/include/linux/netdevice.h	2010-07-22 10:01:22.000000000 +0200
> +++ wireless-testing/include/linux/netdevice.h	2010-07-22 10:11:00.000000000 +0200
> @@ -2148,6 +2148,8 @@ extern void dev_seq_stop(struct seq_file
>  extern int netdev_class_create_file(struct class_attribute *class_attr);
>  extern void netdev_class_remove_file(struct class_attribute *class_attr);
>  
> +extern struct kobj_ns_type_operations net_ns_type_operations;
> +
>  extern char *netdev_drivername(const struct net_device *dev, char *buffer, int len);
>  
>  extern void linkwatch_run_queue(void);
> --- wireless-testing.orig/net/core/net-sysfs.c	2010-07-22 10:01:22.000000000 +0200
> +++ wireless-testing/net/core/net-sysfs.c	2010-07-22 10:11:51.000000000 +0200
> @@ -785,12 +785,13 @@ static const void *net_netlink_ns(struct
>  	return sock_net(sk);
>  }
>  
> -static struct kobj_ns_type_operations net_ns_type_operations = {
> +struct kobj_ns_type_operations net_ns_type_operations = {
>  	.type = KOBJ_NS_TYPE_NET,
>  	.current_ns = net_current_ns,
>  	.netlink_ns = net_netlink_ns,
>  	.initial_ns = net_initial_ns,
>  };
> +EXPORT_SYMBOL_GPL(net_ns_type_operations);
>  
>  static void net_kobj_ns_exit(struct net *net)
>  {
> --- wireless-testing.orig/net/wireless/sysfs.c	2010-07-22 10:01:22.000000000 +0200
> +++ wireless-testing/net/wireless/sysfs.c	2010-07-22 10:13:08.000000000 +0200
> @@ -110,6 +110,13 @@ static int wiphy_resume(struct device *d
>  	return ret;
>  }
>  
> +static const void *wiphy_namespace(struct device *d)
> +{
> +	struct wiphy *wiphy = container_of(d, struct wiphy, dev);
> +
> +	return wiphy_net(wiphy);
> +}
> +
>  struct class ieee80211_class = {
>  	.name = "ieee80211",
>  	.owner = THIS_MODULE,
> @@ -120,6 +127,8 @@ struct class ieee80211_class = {
>  #endif
>  	.suspend = wiphy_suspend,
>  	.resume = wiphy_resume,
> +	.ns_type = &net_ns_type_operations,
> +	.namespace = wiphy_namespace,
>  };
>  
>  int wiphy_sysfs_init(void)

^ permalink raw reply

* [patch -next] stmmac: handle allocation errors in setup functions
From: Dan Carpenter @ 2010-07-22 11:16 UTC (permalink / raw)
  To: Giuseppe Cavallaro; +Cc: David S. Miller, netdev, kernel-janitors

If the allocations fail in either dwmac1000_setup() or dwmac100_setup()
then return NULL.  These are called from stmmac_mac_device_setup().  The 
check for NULL returns in stmmac_mac_device_setup() needed to be moved 
forward a couple lines.

Signed-off-by: Dan Carpenter <error27@gmail.com>

diff --git a/drivers/net/stmmac/dwmac1000_core.c b/drivers/net/stmmac/dwmac1000_core.c
index 917b4e1..2b2f5c8 100644
--- a/drivers/net/stmmac/dwmac1000_core.c
+++ b/drivers/net/stmmac/dwmac1000_core.c
@@ -220,6 +220,8 @@ struct mac_device_info *dwmac1000_setup(unsigned long ioaddr)
 		((uid & 0x0000ff00) >> 8), (uid & 0x000000ff));
 
 	mac = kzalloc(sizeof(const struct mac_device_info), GFP_KERNEL);
+	if (!mac)
+		return NULL;
 
 	mac->mac = &dwmac1000_ops;
 	mac->dma = &dwmac1000_dma_ops;
diff --git a/drivers/net/stmmac/dwmac100_core.c b/drivers/net/stmmac/dwmac100_core.c
index 6f270a0..2fb165f 100644
--- a/drivers/net/stmmac/dwmac100_core.c
+++ b/drivers/net/stmmac/dwmac100_core.c
@@ -179,6 +179,8 @@ struct mac_device_info *dwmac100_setup(unsigned long ioaddr)
 	struct mac_device_info *mac;
 
 	mac = kzalloc(sizeof(const struct mac_device_info), GFP_KERNEL);
+	if (!mac)
+		return NULL;
 
 	pr_info("\tDWMAC100\n");
 
diff --git a/drivers/net/stmmac/stmmac_main.c b/drivers/net/stmmac/stmmac_main.c
index acf0616..0bdd332 100644
--- a/drivers/net/stmmac/stmmac_main.c
+++ b/drivers/net/stmmac/stmmac_main.c
@@ -1558,15 +1558,15 @@ static int stmmac_mac_device_setup(struct net_device *dev)
 	else
 		device = dwmac100_setup(ioaddr);
 
+	if (!device)
+		return -ENOMEM;
+
 	if (priv->enh_desc) {
 		device->desc = &enh_desc_ops;
 		pr_info("\tEnhanced descriptor structure\n");
 	} else
 		device->desc = &ndesc_ops;
 
-	if (!device)
-		return -ENOMEM;
-
 	priv->hw = device;
 
 	priv->wolenabled = priv->hw->pmt;	/* PMT supported */

^ permalink raw reply related

* [patch -next] caif: precedence bug
From: Dan Carpenter @ 2010-07-22 11:11 UTC (permalink / raw)
  To: SjurBraendeland
  Cc: David S. Miller, Sjur Braendeland, netdev, kernel-janitors

Negate has precedence over comparison so the original assert only
checked that "rfml->fragment_size" was larger than 1 or 0.

Signed-off-by: Dan Carpenter <error27@gmail.com>

diff --git a/net/caif/cfrfml.c b/net/caif/cfrfml.c
index 4b04d25..eb16020 100644
--- a/net/caif/cfrfml.c
+++ b/net/caif/cfrfml.c
@@ -193,7 +193,7 @@ out:
 
 static int cfrfml_transmit_segment(struct cfrfml *rfml, struct cfpkt *pkt)
 {
-	caif_assert(!cfpkt_getlen(pkt) < rfml->fragment_size);
+	caif_assert(cfpkt_getlen(pkt) >= rfml->fragment_size);
 
 	/* Add info for MUX-layer to route the packet out. */
 	cfpkt_info(pkt)->channel_id = rfml->serv.layer.id;

^ permalink raw reply related

* Re: [PATCH] sysfs: Don't allow the creation of symlinks we can't remove
From: Johannes Berg @ 2010-07-22 10:41 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Greg KH, Andrew Morton, Rafael J. Wysocki, Maciej W. Rozycki,
	Kay Sievers, Greg KH, netdev
In-Reply-To: <m1mxtjztvy.fsf@fess.ebiederm.org>

On Thu, 2010-07-22 at 03:35 -0700, Eric W. Biederman wrote:

> >> The warning patch just makes things fail faster.  Although I get some of the
> >> wireless interfaces for hwsim when I use this one.
> >
> > Hmm, I didn't.
> 
> To be clear I just get hwsim0.  Not wlan0 or wlan1.

Ah, yes, but that's just a regular netdev, you can pretty much ignore
it. It just shows all hwsim traffic as it is on the "air" for sniffing.

> > Right, it actually starts working again with that patch you sent.
> > However, netns support is really broken:
> >
> > <create net namespace, put phy0/wlan0 into it>
> 
> Do we have a convenient command line tool to do this?
> I remember there being a different netlink message from
> normal network devices.

iw phy0 set netns <pid>

http://git.sipsolutions.net/iw.git

> > root@kvm:~# ip link
> > 3: wlan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
> >     link/ether 02:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
> > 7: lo: <LOOPBACK> mtu 16436 qdisc noop state DOWN 
> >     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> > root@kvm:~# ls /sys/class/net/
> > eth0  hwsim0  lo  wlan1  wlan2
> 
> I think this is actually the output of something working.
> 
> I expect after you created a new netns you didn't mount
> a new instance of /sys.  /sys remembers which netns you
> had when you mounted it.  So you have to mount /sys again
> so you can see the /sys/class/net for the network namespace
> you are in.

Ohh, oops! I saw all the "current->" references in the code and somehow
expected the same instance of sysfs to show the right thing.

Yes, it works now. But the patch below doesn't seem to work, am I
missing something?

johannes

---
 include/linux/netdevice.h |    2 ++
 net/core/net-sysfs.c      |    3 ++-
 net/wireless/sysfs.c      |    9 +++++++++
 3 files changed, 13 insertions(+), 1 deletion(-)

--- wireless-testing.orig/include/linux/netdevice.h	2010-07-22 10:01:22.000000000 +0200
+++ wireless-testing/include/linux/netdevice.h	2010-07-22 10:11:00.000000000 +0200
@@ -2148,6 +2148,8 @@ extern void dev_seq_stop(struct seq_file
 extern int netdev_class_create_file(struct class_attribute *class_attr);
 extern void netdev_class_remove_file(struct class_attribute *class_attr);
 
+extern struct kobj_ns_type_operations net_ns_type_operations;
+
 extern char *netdev_drivername(const struct net_device *dev, char *buffer, int len);
 
 extern void linkwatch_run_queue(void);
--- wireless-testing.orig/net/core/net-sysfs.c	2010-07-22 10:01:22.000000000 +0200
+++ wireless-testing/net/core/net-sysfs.c	2010-07-22 10:11:51.000000000 +0200
@@ -785,12 +785,13 @@ static const void *net_netlink_ns(struct
 	return sock_net(sk);
 }
 
-static struct kobj_ns_type_operations net_ns_type_operations = {
+struct kobj_ns_type_operations net_ns_type_operations = {
 	.type = KOBJ_NS_TYPE_NET,
 	.current_ns = net_current_ns,
 	.netlink_ns = net_netlink_ns,
 	.initial_ns = net_initial_ns,
 };
+EXPORT_SYMBOL_GPL(net_ns_type_operations);
 
 static void net_kobj_ns_exit(struct net *net)
 {
--- wireless-testing.orig/net/wireless/sysfs.c	2010-07-22 10:01:22.000000000 +0200
+++ wireless-testing/net/wireless/sysfs.c	2010-07-22 10:13:08.000000000 +0200
@@ -110,6 +110,13 @@ static int wiphy_resume(struct device *d
 	return ret;
 }
 
+static const void *wiphy_namespace(struct device *d)
+{
+	struct wiphy *wiphy = container_of(d, struct wiphy, dev);
+
+	return wiphy_net(wiphy);
+}
+
 struct class ieee80211_class = {
 	.name = "ieee80211",
 	.owner = THIS_MODULE,
@@ -120,6 +127,8 @@ struct class ieee80211_class = {
 #endif
 	.suspend = wiphy_suspend,
 	.resume = wiphy_resume,
+	.ns_type = &net_ns_type_operations,
+	.namespace = wiphy_namespace,
 };
 
 int wiphy_sysfs_init(void)



^ permalink raw reply

* Re: [PATCH] sysfs: Don't allow the creation of symlinks we can't remove
From: Eric W. Biederman @ 2010-07-22 10:35 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Greg KH, Andrew Morton, Rafael J. Wysocki, Maciej W. Rozycki,
	Kay Sievers, Greg KH, netdev
In-Reply-To: <1279793435.12439.3.camel@jlt3.sipsolutions.net>

Johannes Berg <johannes@sipsolutions.net> writes:

> On Thu, 2010-07-22 at 03:05 -0700, Eric W. Biederman wrote:
>
>> >> Detect this problem up front and simply don't create symlinks we won't
>> >> be able to remove later.  This prevents symlink leakage and fails in
>> >> a much clearer and more understandable way.
>> >
>> > Eric, I was looking into sysfs netns support for wireless, and with this
>> > patch applied I just get the warning and no network interfaces.
>> 
>> The warning patch just makes things fail faster.  Although I get some of the
>> wireless interfaces for hwsim when I use this one.
>
> Hmm, I didn't.

To be clear I just get hwsim0.  Not wlan0 or wlan1.

>> > Was there any patch that was supposed to fix hwsim?
>> 
>> - If you have my patches that fix CONFIG_SYSFS_DEPRECATED,
>>   you should find everything works there.
>
> But then I was carrying those two patches too.
>
>> As for a proper fix I have just resent my one liner to
>> drives/base/core.c I can't think of a better option right now.
>> 
>> For hwsim it is arguable, but the behaviour of sysfs for the
>> bluetooth bnep driver is very clearly a 3 year old regression,
>> and the cause is exactly the same.
>
> Right, it actually starts working again with that patch you sent.
> However, netns support is really broken:
>
> <create net namespace, put phy0/wlan0 into it>

Do we have a convenient command line tool to do this?
I remember there being a different netlink message from
normal network devices.

> root@kvm:~# ip link
> 3: wlan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
>     link/ether 02:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
> 7: lo: <LOOPBACK> mtu 16436 qdisc noop state DOWN 
>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> root@kvm:~# ls /sys/class/net/
> eth0  hwsim0  lo  wlan1  wlan2

I think this is actually the output of something working.

I expect after you created a new netns you didn't mount
a new instance of /sys.  /sys remembers which netns you
had when you mounted it.  So you have to mount /sys again
so you can see the /sys/class/net for the network namespace
you are in.

Eric


^ permalink raw reply

* Re: [PATCH] sysfs: Don't allow the creation of symlinks we can't remove
From: Johannes Berg @ 2010-07-22 10:10 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Greg KH, Andrew Morton, Rafael J. Wysocki, Maciej W. Rozycki,
	Kay Sievers, Greg KH, netdev
In-Reply-To: <m139vb25nz.fsf@fess.ebiederm.org>

On Thu, 2010-07-22 at 03:05 -0700, Eric W. Biederman wrote:

> >> Detect this problem up front and simply don't create symlinks we won't
> >> be able to remove later.  This prevents symlink leakage and fails in
> >> a much clearer and more understandable way.
> >
> > Eric, I was looking into sysfs netns support for wireless, and with this
> > patch applied I just get the warning and no network interfaces.
> 
> The warning patch just makes things fail faster.  Although I get some of the
> wireless interfaces for hwsim when I use this one.

Hmm, I didn't.

> > Was there any patch that was supposed to fix hwsim?
> 
> - If you have my patches that fix CONFIG_SYSFS_DEPRECATED,
>   you should find everything works there.

But then I was carrying those two patches too.

> As for a proper fix I have just resent my one liner to
> drives/base/core.c I can't think of a better option right now.
> 
> For hwsim it is arguable, but the behaviour of sysfs for the
> bluetooth bnep driver is very clearly a 3 year old regression,
> and the cause is exactly the same.

Right, it actually starts working again with that patch you sent.
However, netns support is really broken:

<create net namespace, put phy0/wlan0 into it>

root@kvm:~# ip link
3: wlan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 02:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
7: lo: <LOOPBACK> mtu 16436 qdisc noop state DOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
root@kvm:~# ls /sys/class/net/
eth0  hwsim0  lo  wlan1  wlan2

johannes


^ permalink raw reply

* Re: mirred, redirect action vs. dev refcount issue
From: jamal @ 2010-07-22 10:11 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: David Miller, netdev
In-Reply-To: <20100721170024.60cd9ef4@nehalam>

On Wed, 2010-07-21 at 17:00 -0700, Stephen Hemminger wrote:
> On Wed, 21 Jul 2010 16:58:02 -0700 (PDT)
> David Miller <davem@davemloft.net> wrote:
> 

> > Whether the ifindex or the global list + delete scheme is better is a
> > topic for discussion.  Since from the user's perspective it is unclear
> > which semantic is less surprising, entries disappearing or suddenly
> > stop working (or start applying to a different device which has taken
> > a previous one's ifindex!).
> 
> ifindex is unique (until integer wraps) so that soft reference
> works.

The proper way to do it is via a notifier since we point to the
netdev - and yes it is a little more complex thats why i just
let the admin suffer (IMO) the well deserved consequences[1].

I am in travel mode - but i will do some background thinking and
come up with a good way to resolve it when i get back. Unless you
have a patch you want me to look at.

cheers,
jamal

[1] least element of suprise principle:
Admin adds a rule which says
"you see a packet matching blah incoming on eth0,
do action1 then action2 ... then actionN"
Say action2 is "mirror to ifb0".
And then this same admin goes and rmmods ifb0 - it is easier to
just reject this rmmod operation as we do todau. Maybe we could be 
kinder and be more informative and syslog something along the lines of
"rejected to unregister device you rat-bastard because you have a rule
which says we should mirror to ifb0". 
Thoughts?



^ permalink raw reply

* Re: Fwd: LVS on local node
From: Changli Gao @ 2010-07-22 10:06 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Patrick McHardy, Jan Engelhardt, Franchoze Eric, wensong,
	lvs-devel, netdev, netfilter-devel
In-Reply-To: <1279792792.2467.15.camel@edumazet-laptop>

On Thu, Jul 22, 2010 at 5:59 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Le jeudi 22 juillet 2010 à 17:52 +0800, Changli Gao a écrit :
>
>>
>> FYI: the random option is documented in the manual page of iptables.
>>
>>    REDIRECT
>>        This  target is only valid in the nat table, in the PREROUTING and OUT-
>>        PUT chains, and user-defined chains which are only  called  from  those
>>        chains.   It redirects the packet to the machine itself by changing the
>>        destination IP  to  the  primary  address  of  the  incoming  interface
>>        (locally-generated packets are mapped to the 127.0.0.1 address).
>>
>>        --to-ports port[-port]
>>               This  specifies  a  destination  port  or range of ports to use:
>>               without this, the destination port is never  altered.   This  is
>>               only valid if the rule also specifies -p tcp or -p udp.
>>
>>        --random
>>               If  option --random is used then port mapping will be randomized
>>               (kernel >= 2.6.22).
>>
>>
>
> Note my patch has nothing to do with the man page, its already up2date.
>
> I usually dont read the Fine manuals, do you ?

Yea. And I don't object your patch. so I add FYI. Thanks.

>
> Try :
>
> iptables -t nat -A PREROUTING -p tcp --dport 1234 -j REDIRECT --help
>
> REDIRECT target options:
>  --to-ports <port>[-<port>]
>                                Port (range) to map to.
>
>
> You see [--random] is missing.
>
>


-- 
Regards,
Changli Gao(xiaosuo@gmail.com)

^ permalink raw reply

* Re: [PATCH] sysfs: Don't allow the creation of symlinks we can't remove
From: Eric W. Biederman @ 2010-07-22 10:05 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Greg KH, Andrew Morton, Rafael J. Wysocki, Maciej W. Rozycki,
	Kay Sievers, Greg KH, netdev
In-Reply-To: <1279792459.12439.0.camel@jlt3.sipsolutions.net>

Johannes Berg <johannes@sipsolutions.net> writes:

> On Thu, 2010-07-08 at 09:31 -0700, Eric W. Biederman wrote:
>> Recently my tagged sysfs support revealed a flaw in the device core
>> that a few rare drivers are running into such that we don't always put
>> network devices in a class subdirectory named net/.
>> 
>> Since we are not creating the class directory the network devices wind
>> up in a non-tagged directory, but the symlinks to the network devices
>> from /sys/class/net are in a tagged directory.  All of which works
>> until we go to remove or rename the symlink.  When we remove or rename
>> a symlink we look in the namespace of the target of the symlink.
>> Since the target of the symlink is in a non-tagged sysfs directory we
>> don't have a namespace to look in, and we fail to remove the symlink.
>> 
>> Detect this problem up front and simply don't create symlinks we won't
>> be able to remove later.  This prevents symlink leakage and fails in
>> a much clearer and more understandable way.
>
> Eric, I was looking into sysfs netns support for wireless, and with this
> patch applied I just get the warning and no network interfaces.

The warning patch just makes things fail faster.  Although I get some of the
wireless interfaces for hwsim when I use this one.

> Was there any patch that was supposed to fix hwsim?

- If you have my patches that fix CONFIG_SYSFS_DEPRECATED,
  you should find everything works there.

As for a proper fix I have just resent my one liner to
drives/base/core.c I can't think of a better option right now.

For hwsim it is arguable, but the behaviour of sysfs for the
bluetooth bnep driver is very clearly a 3 year old regression,
and the cause is exactly the same.

Eric

^ permalink raw reply

* Re: Fwd: LVS on local node
From: Eric Dumazet @ 2010-07-22  9:59 UTC (permalink / raw)
  To: Changli Gao
  Cc: Patrick McHardy, Jan Engelhardt, Franchoze Eric, wensong,
	lvs-devel, netdev, netfilter-devel
In-Reply-To: <AANLkTimLZG_-4wbKy8nrajHVKQvvGtRKEd8qHNoU3swJ@mail.gmail.com>

Le jeudi 22 juillet 2010 à 17:52 +0800, Changli Gao a écrit :

> 
> FYI: the random option is documented in the manual page of iptables.
> 
>    REDIRECT
>        This  target is only valid in the nat table, in the PREROUTING and OUT-
>        PUT chains, and user-defined chains which are only  called  from  those
>        chains.   It redirects the packet to the machine itself by changing the
>        destination IP  to  the  primary  address  of  the  incoming  interface
>        (locally-generated packets are mapped to the 127.0.0.1 address).
> 
>        --to-ports port[-port]
>               This  specifies  a  destination  port  or range of ports to use:
>               without this, the destination port is never  altered.   This  is
>               only valid if the rule also specifies -p tcp or -p udp.
> 
>        --random
>               If  option --random is used then port mapping will be randomized
>               (kernel >= 2.6.22).
> 
> 

Note my patch has nothing to do with the man page, its already up2date.

I usually dont read the Fine manuals, do you ?

Try :

iptables -t nat -A PREROUTING -p tcp --dport 1234 -j REDIRECT --help

REDIRECT target options:
 --to-ports <port>[-<port>]
				Port (range) to map to.


You see [--random] is missing.



--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH] sysfs: Don't allow the creation of symlinks we can't remove
From: Johannes Berg @ 2010-07-22  9:54 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Greg KH, Andrew Morton, Rafael J. Wysocki, Maciej W. Rozycki,
	Kay Sievers, Greg KH, netdev
In-Reply-To: <m1630q7x5v.fsf_-_@fess.ebiederm.org>

On Thu, 2010-07-08 at 09:31 -0700, Eric W. Biederman wrote:
> Recently my tagged sysfs support revealed a flaw in the device core
> that a few rare drivers are running into such that we don't always put
> network devices in a class subdirectory named net/.
> 
> Since we are not creating the class directory the network devices wind
> up in a non-tagged directory, but the symlinks to the network devices
> from /sys/class/net are in a tagged directory.  All of which works
> until we go to remove or rename the symlink.  When we remove or rename
> a symlink we look in the namespace of the target of the symlink.
> Since the target of the symlink is in a non-tagged sysfs directory we
> don't have a namespace to look in, and we fail to remove the symlink.
> 
> Detect this problem up front and simply don't create symlinks we won't
> be able to remove later.  This prevents symlink leakage and fails in
> a much clearer and more understandable way.

Eric, I was looking into sysfs netns support for wireless, and with this
patch applied I just get the warning and no network interfaces.

Was there any patch that was supposed to fix hwsim?

johannes


^ permalink raw reply

* Re: Fwd: LVS on local node
From: Changli Gao @ 2010-07-22  9:52 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Patrick McHardy, Jan Engelhardt, Franchoze Eric, wensong,
	lvs-devel, netdev, netfilter-devel
In-Reply-To: <1279791964.2467.12.camel@edumazet-laptop>

On Thu, Jul 22, 2010 at 5:46 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Le jeudi 22 juillet 2010 à 17:10 +0800, Changli Gao a écrit :
>
>>
>> I think maybe REDIRECT is enough. If the public port is one of the
>> real ports, you need to append "random" option to iptables target
>> REDIRECT. If not, "REDIRECT --to-ports 1000-1007" is good enough, and
>> the destination port will be selected in the round-robin manner.
>>
>
> Yes, on 2.6.32, no RPS, so undocumented --random option is probably the
> best we can offer. (random option was added in 2.6.22)
>
> iptables -t nat -A PREROUTING -p tcp --dport 1234 -j REDIRECT --random --to-port 1000-1007
>
> Here is a patch to add "random" help to REDIRECT iptables target
>

FYI: the random option is documented in the manual page of iptables.

   REDIRECT
       This  target is only valid in the nat table, in the PREROUTING and OUT-
       PUT chains, and user-defined chains which are only  called  from  those
       chains.   It redirects the packet to the machine itself by changing the
       destination IP  to  the  primary  address  of  the  incoming  interface
       (locally-generated packets are mapped to the 127.0.0.1 address).

       --to-ports port[-port]
              This  specifies  a  destination  port  or range of ports to use:
              without this, the destination port is never  altered.   This  is
              only valid if the rule also specifies -p tcp or -p udp.

       --random
              If  option --random is used then port mapping will be randomized
              (kernel >= 2.6.22).


-- 
Regards,
Changli Gao(xiaosuo@gmail.com)
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* [PATCH] nf_nat_core: merge the same lines
From: Changli Gao @ 2010-07-22  9:49 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: David S. Miller, netfilter-devel, netdev, Changli Gao

nf_nat_core: merge the same lines

proto->unique_tuple() will be called finally, if the previous calls fail. This
patch checks the false condition of (range->flags & IP_NAT_RANGE_PROTO_RANDOM)
instead to avoid duplicate line of code: proto->unique_tuple().

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 net/ipv4/netfilter/nf_nat_core.c |    9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/net/ipv4/netfilter/nf_nat_core.c b/net/ipv4/netfilter/nf_nat_core.c
index c7719b2..037a3a6 100644
--- a/net/ipv4/netfilter/nf_nat_core.c
+++ b/net/ipv4/netfilter/nf_nat_core.c
@@ -261,14 +261,9 @@ get_unique_tuple(struct nf_conntrack_tuple *tuple,
 	rcu_read_lock();
 	proto = __nf_nat_proto_find(orig_tuple->dst.protonum);
 
-	/* Change protocol info to have some randomization */
-	if (range->flags & IP_NAT_RANGE_PROTO_RANDOM) {
-		proto->unique_tuple(tuple, range, maniptype, ct);
-		goto out;
-	}
-
 	/* Only bother mapping if it's not already in range and unique */
-	if ((!(range->flags & IP_NAT_RANGE_PROTO_SPECIFIED) ||
+	if (!(range->flags & IP_NAT_RANGE_PROTO_RANDOM) &&
+	    (!(range->flags & IP_NAT_RANGE_PROTO_SPECIFIED) ||
 	     proto->in_range(tuple, maniptype, &range->min, &range->max)) &&
 	    !nf_nat_used_tuple(tuple, ct))
 		goto out;

^ permalink raw reply related

* Re: Fwd: LVS on local node
From: Eric Dumazet @ 2010-07-22  9:46 UTC (permalink / raw)
  To: Changli Gao, Patrick McHardy, Jan Engelhardt
  Cc: Franchoze Eric, wensong, lvs-devel, netdev, netfilter-devel
In-Reply-To: <AANLkTimneLE2xKg6fie5u3Cvjzcrq4VnK6wMT3pno0JK@mail.gmail.com>

Le jeudi 22 juillet 2010 à 17:10 +0800, Changli Gao a écrit :

> 
> I think maybe REDIRECT is enough. If the public port is one of the
> real ports, you need to append "random" option to iptables target
> REDIRECT. If not, "REDIRECT --to-ports 1000-1007" is good enough, and
> the destination port will be selected in the round-robin manner.
> 

Yes, on 2.6.32, no RPS, so undocumented --random option is probably the
best we can offer. (random option was added in 2.6.22)

iptables -t nat -A PREROUTING -p tcp --dport 1234 -j REDIRECT --random --to-port 1000-1007

Here is a patch to add "random" help to REDIRECT iptables target

Thanks

[PATCH] extensions: REDIRECT: add random help

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---

diff --git a/extensions/libipt_REDIRECT.c b/extensions/libipt_REDIRECT.c
index 3dfcadf..324d0eb 100644
--- a/extensions/libipt_REDIRECT.c
+++ b/extensions/libipt_REDIRECT.c
@@ -17,7 +17,8 @@ static void REDIRECT_help(void)
 	printf(
 "REDIRECT target options:\n"
 " --to-ports <port>[-<port>]\n"
-"				Port (range) to map to.\n");
+"				Port (range) to map to.\n"
+" [--random]\n");
 }
 
 static const struct option REDIRECT_opts[] = {


--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* Re: [PATCH] CAN: Add Flexcan CAN controller driver
From: Wolfgang Grandegger @ 2010-07-22  9:44 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: socketcan-core-0fE9KPoRgkgATYTw5x5z8w,
	netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <4C481064.8090809-bIcnvbaLZ9MEGnE8C9+IrQ@public.gmane.org>

On 07/22/2010 11:33 AM, Marc Kleine-Budde wrote:
> Wolfgang Grandegger wrote:
>> On 07/21/2010 10:42 PM, Marc Kleine-Budde wrote:
>>> Marc Kleine-Budde wrote:
>>>> Wolfgang Grandegger wrote:
>>>>> I realized a few issues. You can add my "acked-by" when they are fixed.
>>>> thanks for the review.
>>> [...]
>>>
>>>>>> +static void flexcan_poll_err_frame(struct net_device *dev,
>>>>>> +				   struct can_frame *cf, u32 reg_esr)
>>>>>> +{
>>>>>> +	struct flexcan_priv *priv = netdev_priv(dev);
>>>>>> +	int error_warning = 0, rx_errors = 0, tx_errors = 0;
>>>>>> +
>>>>>> +	if (reg_esr & FLEXCAN_ESR_BIT1_ERR) {
>>>>>> +		rx_errors = 1;
>>>>>> +		cf->can_id |= CAN_ERR_PROT | CAN_ERR_BUSERROR;
>>>>>> +		cf->data[2] |= CAN_ERR_PROT_BIT1;
>>>>>> +	}
>>>>>> +
>>>>>> +	if (reg_esr & FLEXCAN_ESR_BIT0_ERR) {
>>>>>> +		rx_errors = 1;
>>>>>> +		cf->can_id |= CAN_ERR_PROT | CAN_ERR_BUSERROR;
>>>>>> +		cf->data[2] |= CAN_ERR_PROT_BIT0;
>>>>>> +	}
>>>>>> +
>>>>>> +	if (reg_esr & FLEXCAN_ESR_FRM_ERR) {
>>>>>> +		rx_errors = 1;
>>>>>> +		cf->can_id |= CAN_ERR_PROT | CAN_ERR_BUSERROR;
>>>>>> +		cf->data[2] |= CAN_ERR_PROT_FORM;
>>>>>> +	}
>>>>>> +
>>>>>> +	if (reg_esr & FLEXCAN_ESR_STF_ERR) {
>>>>>> +		rx_errors = 1;
>>>>>> +		cf->can_id |= CAN_ERR_PROT | CAN_ERR_BUSERROR;
>>>>>> +		cf->data[2] |= CAN_ERR_PROT_STUFF;
>>>>>> +	}
>>>>>> +
>>>>>> +
>>>>>> +	if (reg_esr & FLEXCAN_ESR_ACK_ERR) {
>>>>>> +		tx_errors = 1;
>>>>>> +		cf->can_id |= CAN_ERR_ACK;
>>>>> This is a bus-error as well. Therefore I think it should be:
>>>>>
>>>>> 	if (reg_esr & FLEXCAN_ESR_ACK_ERR) {
>>>>> 		tx_errors = 1;
>>>>> 		cf->can_id |= CAN_ERR_ACK;
>>>>> 		cf->can_id |= CAN_ERR_PROT | CAN_ERR_BUSERROR;
>>>>> 		cf->data[3] |= CAN_ERR_PROT_LOC_ACK; /* ACK slot */
>>>>> 	}
>>>>>
>>>>> I need to check what CAN_ERR_ACK is intended for. Then, cf->can_id could
>>>>> be preset with "CAN_ERR_PROT | CAN_ERR_BUSERROR" at the beginning.
>>> This controller issues an ACK error if there are no other nodes on the
>>> CAN bus to send a ACK that the message has been received. Or all
>>> remaining Nodes are in bus off state.
>>
>> Mainly this bus error can cause the infamous IRQ and message flooding
>> when no cable is connected.
> 
> No cable connected can (if your node doesn't have an activated on baord
> termination) result in no termination at all, and this may result in a
> different error.
> 
> At least it does on the at91. I haven't checked with the flexcan.
> 
> The subtile difference is that the CAN controller isn't allowed to go
> into bus-off with a proper terminated bus when it recevies no ACKs, but
> going to bus off on a not terminated bus is okay.
> 
>>> From the datasheet:
>>> "This bit indicates that an acknowledge (ACK) error has been detected by
>>> the transmitter node; that is, a dominant bit has not been detected
>>> during the ACK SLOT."
>>
>> That's what the above error describes, like on the SJA1000, apart from
>> setting CAN_ERR_ACK. Setting CAN_ERR_ACK is somehow bogus, but just
>> leave it for the time being. I will fix it globally when time permits.
> 
> Now I'm confused. What's the meaning of CAN_ERR_ACK? When should it be used?

The type of the error is already defined via "CAN_ERR_PROT |
CAN_ERR_BUSERROR". The details of "CAN_ERR_PROT" are described in
data[2..3]. Just for the ACK errors we have CAN_ERR_ACK, but not for the
other bus errors and I ask myself why CAN_ERR_ACK was introduced.
If it does not have another meaning, I tend to remove it (only the AT91
driver actually uses it).

Wolfgang.

^ permalink raw reply

* Re: [PATCH] CAN: Add Flexcan CAN controller driver
From: Marc Kleine-Budde @ 2010-07-22  9:33 UTC (permalink / raw)
  To: Wolfgang Grandegger
  Cc: socketcan-core-0fE9KPoRgkgATYTw5x5z8w,
	netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <4C480E7E.70607-5Yr1BZd7O62+XT7JhA+gdA@public.gmane.org>


[-- Attachment #1.1: Type: text/plain, Size: 3381 bytes --]

Wolfgang Grandegger wrote:
> On 07/21/2010 10:42 PM, Marc Kleine-Budde wrote:
>> Marc Kleine-Budde wrote:
>>> Wolfgang Grandegger wrote:
>>>> I realized a few issues. You can add my "acked-by" when they are fixed.
>>> thanks for the review.
>> [...]
>>
>>>>> +static void flexcan_poll_err_frame(struct net_device *dev,
>>>>> +				   struct can_frame *cf, u32 reg_esr)
>>>>> +{
>>>>> +	struct flexcan_priv *priv = netdev_priv(dev);
>>>>> +	int error_warning = 0, rx_errors = 0, tx_errors = 0;
>>>>> +
>>>>> +	if (reg_esr & FLEXCAN_ESR_BIT1_ERR) {
>>>>> +		rx_errors = 1;
>>>>> +		cf->can_id |= CAN_ERR_PROT | CAN_ERR_BUSERROR;
>>>>> +		cf->data[2] |= CAN_ERR_PROT_BIT1;
>>>>> +	}
>>>>> +
>>>>> +	if (reg_esr & FLEXCAN_ESR_BIT0_ERR) {
>>>>> +		rx_errors = 1;
>>>>> +		cf->can_id |= CAN_ERR_PROT | CAN_ERR_BUSERROR;
>>>>> +		cf->data[2] |= CAN_ERR_PROT_BIT0;
>>>>> +	}
>>>>> +
>>>>> +	if (reg_esr & FLEXCAN_ESR_FRM_ERR) {
>>>>> +		rx_errors = 1;
>>>>> +		cf->can_id |= CAN_ERR_PROT | CAN_ERR_BUSERROR;
>>>>> +		cf->data[2] |= CAN_ERR_PROT_FORM;
>>>>> +	}
>>>>> +
>>>>> +	if (reg_esr & FLEXCAN_ESR_STF_ERR) {
>>>>> +		rx_errors = 1;
>>>>> +		cf->can_id |= CAN_ERR_PROT | CAN_ERR_BUSERROR;
>>>>> +		cf->data[2] |= CAN_ERR_PROT_STUFF;
>>>>> +	}
>>>>> +
>>>>> +
>>>>> +	if (reg_esr & FLEXCAN_ESR_ACK_ERR) {
>>>>> +		tx_errors = 1;
>>>>> +		cf->can_id |= CAN_ERR_ACK;
>>>> This is a bus-error as well. Therefore I think it should be:
>>>>
>>>> 	if (reg_esr & FLEXCAN_ESR_ACK_ERR) {
>>>> 		tx_errors = 1;
>>>> 		cf->can_id |= CAN_ERR_ACK;
>>>> 		cf->can_id |= CAN_ERR_PROT | CAN_ERR_BUSERROR;
>>>> 		cf->data[3] |= CAN_ERR_PROT_LOC_ACK; /* ACK slot */
>>>> 	}
>>>>
>>>> I need to check what CAN_ERR_ACK is intended for. Then, cf->can_id could
>>>> be preset with "CAN_ERR_PROT | CAN_ERR_BUSERROR" at the beginning.
>> This controller issues an ACK error if there are no other nodes on the
>> CAN bus to send a ACK that the message has been received. Or all
>> remaining Nodes are in bus off state.
> 
> Mainly this bus error can cause the infamous IRQ and message flooding
> when no cable is connected.

No cable connected can (if your node doesn't have an activated on baord
termination) result in no termination at all, and this may result in a
different error.

At least it does on the at91. I haven't checked with the flexcan.

The subtile difference is that the CAN controller isn't allowed to go
into bus-off with a proper terminated bus when it recevies no ACKs, but
going to bus off on a not terminated bus is okay.

>> From the datasheet:
>> "This bit indicates that an acknowledge (ACK) error has been detected by
>> the transmitter node; that is, a dominant bit has not been detected
>> during the ACK SLOT."
> 
> That's what the above error describes, like on the SJA1000, apart from
> setting CAN_ERR_ACK. Setting CAN_ERR_ACK is somehow bogus, but just
> leave it for the time being. I will fix it globally when time permits.

Now I'm confused. What's the meaning of CAN_ERR_ACK? When should it be used?

Cheers, Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |


[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 260 bytes --]

[-- Attachment #2: Type: text/plain, Size: 188 bytes --]

_______________________________________________
Socketcan-core mailing list
Socketcan-core-0fE9KPoRgkgATYTw5x5z8w@public.gmane.org
https://lists.berlios.de/mailman/listinfo/socketcan-core

^ permalink raw reply

* Re: macvtap: Limit packet queue length
From: Arnd Bergmann @ 2010-07-22  9:30 UTC (permalink / raw)
  To: Herbert Xu; +Cc: David S. Miller, netdev, Mark Wagner, Chris Wright
In-Reply-To: <20100722074431.GA26744@gondor.apana.org.au>

On Thursday 22 July 2010, Herbert Xu wrote:
> On Thu, Jul 22, 2010 at 02:41:57PM +0800, Herbert Xu wrote:
> > Hi:
> > 
> > macvtap: Limit packet queue length
> 
> Chris has informed me that he's already tried a similar patch
> and it only makes the problem worse :)
> 
> The issue is that the macvtap TX queue length defaults to zero.
> 
> So here is an updated patch which addresses this:

Thanks for debugging this and coming up with a solution.
I'm currently travelling, so I can't easily work on it myself.

> Please note that macvtap currently has no way of giving congestion
> notification, that means the software device TX queue cannot be
> used and packets will always be dropped once the macvtap driver
> queue fills up.

This is something I was planning to look into for doing it right,
and then I forgot about it. I'll investigate what could be done
to get proper flow control once I get back to the office.

> Chris Wright noticed that for this patch to work, we need a
> non-zero TX queue length.  This patch includes his work to change
> the default macvtap TX queue length to 500.

The only problem I can see with this patch is making it depend on
the *TX* queue length. The point is that unlike tun/tap, the
macvtap network interface's point of view is that this is the
receive queue, not the transmit queue.

In the TX direction, we really don't queue, since we simply forward
to the lowerdev tx queue, so exposing the tunable to user space
as the tx queue length is a little bit awkward, as well as inconsistent
between macvtap and macvlan.

> Reported-by: Mark Wagner <mwagner@redhat.com>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

As long as we're missing a better solution,

Acked-by: Arnd Bergmann <arnd@arndb.de>

^ permalink raw reply

* Re: [PATCH] CAN: Add Flexcan CAN controller driver
From: Wolfgang Grandegger @ 2010-07-22  9:25 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: socketcan-core-0fE9KPoRgkgATYTw5x5z8w,
	netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <4C475BA6.3030505-bIcnvbaLZ9MEGnE8C9+IrQ@public.gmane.org>

On 07/21/2010 10:42 PM, Marc Kleine-Budde wrote:
> Marc Kleine-Budde wrote:
>> Wolfgang Grandegger wrote:
>>> I realized a few issues. You can add my "acked-by" when they are fixed.
>>
>> thanks for the review.
> 
> [...]
> 
>>>> +static void flexcan_poll_err_frame(struct net_device *dev,
>>>> +				   struct can_frame *cf, u32 reg_esr)
>>>> +{
>>>> +	struct flexcan_priv *priv = netdev_priv(dev);
>>>> +	int error_warning = 0, rx_errors = 0, tx_errors = 0;
>>>> +
>>>> +	if (reg_esr & FLEXCAN_ESR_BIT1_ERR) {
>>>> +		rx_errors = 1;
>>>> +		cf->can_id |= CAN_ERR_PROT | CAN_ERR_BUSERROR;
>>>> +		cf->data[2] |= CAN_ERR_PROT_BIT1;
>>>> +	}
>>>> +
>>>> +	if (reg_esr & FLEXCAN_ESR_BIT0_ERR) {
>>>> +		rx_errors = 1;
>>>> +		cf->can_id |= CAN_ERR_PROT | CAN_ERR_BUSERROR;
>>>> +		cf->data[2] |= CAN_ERR_PROT_BIT0;
>>>> +	}
>>>> +
>>>> +	if (reg_esr & FLEXCAN_ESR_FRM_ERR) {
>>>> +		rx_errors = 1;
>>>> +		cf->can_id |= CAN_ERR_PROT | CAN_ERR_BUSERROR;
>>>> +		cf->data[2] |= CAN_ERR_PROT_FORM;
>>>> +	}
>>>> +
>>>> +	if (reg_esr & FLEXCAN_ESR_STF_ERR) {
>>>> +		rx_errors = 1;
>>>> +		cf->can_id |= CAN_ERR_PROT | CAN_ERR_BUSERROR;
>>>> +		cf->data[2] |= CAN_ERR_PROT_STUFF;
>>>> +	}
>>>> +
>>>> +
>>>> +	if (reg_esr & FLEXCAN_ESR_ACK_ERR) {
>>>> +		tx_errors = 1;
>>>> +		cf->can_id |= CAN_ERR_ACK;
>>> This is a bus-error as well. Therefore I think it should be:
>>>
>>> 	if (reg_esr & FLEXCAN_ESR_ACK_ERR) {
>>> 		tx_errors = 1;
>>> 		cf->can_id |= CAN_ERR_ACK;
>>> 		cf->can_id |= CAN_ERR_PROT | CAN_ERR_BUSERROR;
>>> 		cf->data[3] |= CAN_ERR_PROT_LOC_ACK; /* ACK slot */
>>> 	}
>>>
>>> I need to check what CAN_ERR_ACK is intended for. Then, cf->can_id could
>>> be preset with "CAN_ERR_PROT | CAN_ERR_BUSERROR" at the beginning.
> 
> This controller issues an ACK error if there are no other nodes on the
> CAN bus to send a ACK that the message has been received. Or all
> remaining Nodes are in bus off state.

Mainly this bus error can cause the infamous IRQ and message flooding
when no cable is connected.

> From the datasheet:
> "This bit indicates that an acknowledge (ACK) error has been detected by
> the transmitter node; that is, a dominant bit has not been detected
> during the ACK SLOT."

That's what the above error describes, like on the SJA1000, apart from
setting CAN_ERR_ACK. Setting CAN_ERR_ACK is somehow bogus, but just
leave it for the time being. I will fix it globally when time permits.

Wolfgang.

^ permalink raw reply

* [PATCH] Driver-core: Fix bluetooth network device rename regression
From: Eric W. Biederman @ 2010-07-22  9:16 UTC (permalink / raw)
  To: Greg KH
  Cc: Andrew Morton, Greg KH, Rafael J. Wysocki, Maciej W. Rozycki,
	Kay Sievers, Johannes Berg, netdev
In-Reply-To: <m139vd5sms.fsf_-_@fess.ebiederm.org>


With CONFIG_SYSFS_DEPRECATED_V2 enabled I can rename any network device
anything as long as the new name does not conflict with another network
device.

With CONFIG_SYSFS_DEPRECATED_V2 disabled without this fix bluetooth benp
devices, and the mac80211_hwsim driver can not be renamed to any arbitrary
name that happens to conflict with any other name that is used in their
parent devices directory.

The device model usage of the bluetooth bnep driver has not changed since
the current device model sysfs laytout was introduced.   Making this a latent
and very annoying regression.

This regression was reported at the time it was introduced and apprently a
few cases were missed by:

commit 864062457a2e444227bd368ca5f2a2b740de604f
Author: Kay Sievers <kay.sievers@vrfy.org>
Date:   Wed Mar 14 03:25:56 2007 +0100

    driver core: fix namespace issue with devices assigned to classes

      - uses a kset in "struct class" to keep track of all directories
        belonging to this class
      - merges with the /sys/devices/virtual logic.
      - removes the namespace-dir if the last member of that class
        leaves the directory.

    There may be locking or refcounting fixes left, I stopped when it seemed
    to work with network and sound modules. :)

    From: Kay Sievers <kay.sievers@vrfy.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

That bug fix had a completely undocumented and apparently deliberate omission
where it does not apply in some cases.  Those omitted cases cover the
bluetooth network driver.

What makes this regression a serious issue now is the introduction of network
namespace support in sysfs took this from a mild bug to complete driver
non-function.

Several reasons have been put forward not to use this one line bug fix in
the driver core.

- We don't have special cases in the driver core.

  This is non-sense we currently have special cases for block devices
  scattered all throughout the driver core.

- The driver is at fault.

  The bluetooth driver's driver core usage predates the introduction of the
  current sysfs layout.  It has been 3 years and no one has bothered
  to change the driver in all of this time.

  If this was really a driver issue that someone cared about and not simply
  an academic issue the driver should have been fixed long since.

I offer these reasons to make the change.

- There is not enough information available for sysfs to support rename
  or delete of symlinks when only the source directory but not the destination
  directory is tagged.

- All of the proposals put forth will change the sysfs layout slightly in
  one way or another.

- A network device special case makes sense as network devices are unique
  in placing a user choosen name sysfs.

- The driver that are affected are different in sysfs from all other
  network devices which likely already consuses any userspace software
  that cares about the exact device layout.

- The change is one line that is obviously correct and has no chance
  of affecting anything that it is not a network device.

- Real users have hit this problem and reported this bug and those users
  deserve drivers that work.

The fix is a trivial change to get_device_parent to always create
the network class directory in any parent of a network device,
Ignoring the incorrect, non-obvious and esoteric rules that the
driver core is trying to impose on the creation of class directories.

Ideally I would remove those crazy special cases get_device_parent for all
devices but in testing it was observed there are other devices such as the
input layer that don't create the class directories today, and applying the
change broadly is likely to break something in userspace and there is no
need to take that risk.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
 drivers/base/core.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 9630fbd..ffb8443 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -673,7 +673,7 @@ static struct kobject *get_device_parent(struct device *dev,
 		 */
 		if (parent == NULL)
 			parent_kobj = virtual_device_parent(dev);
-		else if (parent->class)
+		else if (parent->class && (strcmp(dev->class->name, "net") != 0))
 			return &parent->kobj;
 		else
 			parent_kobj = &parent->kobj;
-- 
1.6.5.2.143.g8cc62


^ permalink raw reply related

* Re: Fwd: LVS on local node
From: Changli Gao @ 2010-07-22  9:10 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Franchoze Eric, wensong, lvs-devel, netdev, netfilter-devel
In-Reply-To: <1279781811.2405.15.camel@edumazet-laptop>

On Thu, Jul 22, 2010 at 2:56 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> lvs seems not very SMP friendly and a bit complex.
>
> I would use an iptables setup and a slighly modified REDIRECT target
> (and/or a nf_nat_setup_info() change)
>
> Say you have 8 daemons listening on different ports (1000 to 1007)
>
> iptables -t nat -A PREROUTING -p tcp --dport 1234 -j REDIRECT --rxhash-dist --to-port 1000-1007
>
> rxhash would be provided by RPS on recent kernels or locally computed if
> not already provided by core network (or old kernel)
>
> This rule would be triggered only at connection establishment.
> conntracking take care of following packets and is SMP friendly.
>
>

I think maybe REDIRECT is enough. If the public port is one of the
real ports, you need to append "random" option to iptables target
REDIRECT. If not, "REDIRECT --to-ports 1000-1007" is good enough, and
the destination port will be selected in the round-robin manner.

-- 
Regards,
Changli Gao(xiaosuo@gmail.com)

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox