* Off-by-one error in net/8021q/vlan.c
From: Phil Karn @ 2011-02-16 10:58 UTC (permalink / raw)
To: kaber; +Cc: netdev
The range check on vlan_id in register_vlan_device is off by one, and it
prevents the creation of a vlan interface for vlan ID 4095. (OSX allows
this, I checked.)
Here's the trivial patch:
--- linux-2.6.37/net/8021q/vlan.c~ 2011-01-04 16:50:19.000000000 -0800
+++ linux-2.6.37/net/8021q/vlan.c 2011-02-16 02:43:13.988812958 -0800
@@ -239,7 +239,7 @@
char name[IFNAMSIZ];
int err;
- if (vlan_id >= VLAN_VID_MASK)
+ if (vlan_id > VLAN_VID_MASK)
return -ERANGE;
err = vlan_check_real_dev(real_dev, vlan_id);
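For anyone following the boundary condition, here is a minimal userspace sketch of the two comparisons. It assumes VLAN_VID_MASK is 0x0fff (the value in include/linux/if_vlan.h) and that vlan_id is a 16-bit value. The helper names vid_ok_current, vid_ok_patched and vid_boundary_demo are made up for illustration and are not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 12-bit VLAN ID field; assumed to match the kernel's definition. */
#define VLAN_VID_MASK 0x0fff

/* Hypothetical helper: the existing check, which rejects IDs >= 4095. */
static inline bool vid_ok_current(uint16_t vlan_id)
{
	return !(vlan_id >= VLAN_VID_MASK);
}

/* Hypothetical helper: the check after the patch, which rejects IDs > 4095. */
static inline bool vid_ok_patched(uint16_t vlan_id)
{
	return !(vlan_id > VLAN_VID_MASK);
}

/* The two checks differ only at vlan_id == 4095. */
static inline void vid_boundary_demo(void)
{
	assert(vid_ok_current(4094) && vid_ok_patched(4094));
	assert(!vid_ok_current(4095) && vid_ok_patched(4095));
	assert(!vid_ok_current(4096) && !vid_ok_patched(4096));
}
```

The sketch only shows which IDs each comparison lets through; it takes no position on whether ID 4095 ought to be usable.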
^ permalink raw reply
* Re: [PATCH 1/1] tproxy: do not assign timewait sockets to skb->sk
From: Florian Westphal @ 2011-02-16 11:30 UTC (permalink / raw)
To: KOVACS Krisztian
Cc: Patrick McHardy, netfilter-devel, netdev, Balazs Scheidler
In-Reply-To: <4D5B90C7.5040603@balabit.hu>
KOVACS Krisztian <hidden@balabit.hu> wrote:
> On 02/14/2011 04:51 PM, Patrick McHardy wrote:
> >Am 14.02.2011 12:44, schrieb Florian Westphal:
> >>Assigning a socket in timewait state to skb->sk can trigger
> >>kernel oops, e.g. in nfnetlink_log, which does:
> >>
> >>if (skb->sk) {
> >> read_lock_bh(&skb->sk->sk_callback_lock);
> >> if (skb->sk->sk_socket && skb->sk->sk_socket->file) ...
> >>
> >>in the timewait case, accessing sk->sk_callback_lock and sk->sk_socket
> >>is invalid.
> >>
> >>Either all of these spots will need to add a test for sk->sk_state != TCP_TIME_WAIT,
> >>or xt_TPROXY must not assign a timewait socket to skb->sk.
> >>
> >>This does the latter.
> >>
> >>If a TW socket is found, assign the tproxy nfmark, but skip the skb->sk assignment,
> >>thus mimicking behaviour of a '-m socket .. -j MARK/ACCEPT' re-routing rule.
> >>
> >>The 'SYN to TW socket' case is left unchanged -- we try to redirect to the
> >>listener socket.
> >>
> >>Cc: Balazs Scheidler <bazsi@balabit.hu>
> >>Cc: KOVACS Krisztian <hidden@balabit.hu>
> >>Signed-off-by: Florian Westphal <fwestphal@astaro.com>
> >
> >Looks fine to me. Balazs, Krisztian, any objections?
>
> Seems to be OK, as far as I can see.
>
> Florian, did you make sure the tests still run after applying this patch?
>
> http://git.balabit.hu/?p=bazsi/tproxy-test.git;a=summary
Thanks for the hint, I cloned this and ran it on my test setup:
./tproxy-test.py
[..]
PASS: ('192.168.10.8', 50080), we got a connection as we deserved
PASS: everything is fine
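As an illustration of the rule the patch applies (set the mark, but never attach a timewait socket to skb->sk), here is a rough userspace sketch. The mock_sock and mock_skb types, the state enum, tproxy_assign_sock and tproxy_demo are all invented for this example. The real code operates on struct sock and struct sk_buff and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in socket states; not the kernel's TCP state values. */
enum mock_state { MOCK_ESTABLISHED, MOCK_TIME_WAIT };

/* Hypothetical, heavily reduced versions of a socket and a packet buffer. */
struct mock_sock { enum mock_state state; };
struct mock_skb { struct mock_sock *sk; uint32_t mark; };

/*
 * Always apply the mark, but attach the socket to the packet only when it
 * is a full socket. A timewait socket is left unattached so that later
 * code never dereferences fields it does not have.
 */
static inline void tproxy_assign_sock(struct mock_skb *skb,
				      struct mock_sock *sk, uint32_t mark)
{
	skb->mark = mark;
	if (sk->state != MOCK_TIME_WAIT)
		skb->sk = sk;
}

/* Small demonstration of both cases. */
static inline void tproxy_demo(void)
{
	struct mock_sock tw = { MOCK_TIME_WAIT };
	struct mock_sock est = { MOCK_ESTABLISHED };
	struct mock_skb a = { NULL, 0 };
	struct mock_skb b = { NULL, 0 };

	tproxy_assign_sock(&a, &tw, 0x1);
	tproxy_assign_sock(&b, &est, 0x1);
	assert(a.sk == NULL && a.mark == 0x1);   /* marked, not attached */
	assert(b.sk == &est && b.mark == 0x1);   /* marked and attached */
}
```

The "SYN to TW socket" case mentioned in the patch description (redirecting to the listener) is deliberately left out of this sketch.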
^ permalink raw reply
* Re: Off-by-one error in net/8021q/vlan.c
From: richard -rw- weinberger @ 2011-02-16 12:51 UTC (permalink / raw)
To: Phil Karn; +Cc: kaber, netdev
In-Reply-To: <4D5BADCF.5000804@ka9q.net>
On Wed, Feb 16, 2011 at 11:58 AM, Phil Karn <karn@ka9q.net> wrote:
> The range check on vlan_id in register_vlan_device is off by one, and it
> prevents the creation of a vlan interface for vlan ID 4095. (OSX allows
> this, I checked.)
Then OSX should fix their code. 4095 is reserved.
//richard
> Here's the trivial patch:
>
> --- linux-2.6.37/net/8021q/vlan.c~ 2011-01-04 16:50:19.000000000 -0800
> +++ linux-2.6.37/net/8021q/vlan.c 2011-02-16 02:43:13.988812958 -0800
> @@ -239,7 +239,7 @@
> char name[IFNAMSIZ];
> int err;
>
> - if (vlan_id >= VLAN_VID_MASK)
> + if (vlan_id > VLAN_VID_MASK)
> return -ERANGE;
>
> err = vlan_check_real_dev(real_dev, vlan_id);
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
--
Thanks,
//richard
^ permalink raw reply
* NFS on little-endian platform - Microblaze
From: Michal Simek @ 2011-02-16 13:09 UTC (permalink / raw)
To: netdev; +Cc: David Miller
Hi All,
I am trying to understand a problem we have found.
On the little-endian Microblaze platform I can't mount NFS without the
-o nolock option. (Log below.)
Selecting tcp or udp has no effect.
I am using the emaclite driver, and there is no problem on big-endian Microblaze.
ping, telnet, http, ftp, iperf and netperf all work well.
Hence my question: is there any endian-specific option for NFS?
Thanks,
Michal
~ # mount -t nfs 192.168.0.101:/tftpboot/nfs /mnt
svc: failed to register lockdv1 RPC service (errno 13).
lockd_up: makesock failed, error=-13
svc: failed to register lockdv1 RPC service (errno 13).
~ # mount -t nfs -o nolock 192.168.0.101:/tftpboot/nfs /mnt
~ # mount
rootfs on / type rootfs (rw)
proc on /proc type proc (rw,relatime)
none on /var type ramfs (rw,relatime)
none on /sys type sysfs (rw,relatime)
none on /etc/config type ramfs (rw,relatime)
none on /dev/pts type devpts (rw,relatime,mode=600)
192.168.0.101:/tftpboot/nfs on /mnt type nfs
(rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,nolock,proto=udp,port=65535,timeo=7,retrans=3,sec=sys,local_lock=all,addr=192.168.0.101)
~ #
~ # ps
PID USER TIME COMMAND
1 root 0:02 init
2 root 0:00 [kthreadd]
3 root 0:00 [ksoftirqd/0]
4 root 0:00 [kworker/0:0]
5 root 0:00 [kworker/u:0]
6 root 0:00 [khelper]
7 root 0:00 [sync_supers]
8 root 0:00 [bdi-default]
9 root 0:00 [kblockd]
10 root 0:00 [rpciod]
11 root 0:00 [kworker/0:1]
12 root 0:00 [kswapd0]
13 root 0:00 [fsnotify_mark]
14 root 0:00 [aio]
15 root 0:00 [nfsiod]
16 root 0:00 [kworker/u:1]
58 root 0:00 udhcpc -R -n -p /var/run/udhcpc.eth0.pid -i eth0
62 1 0:00 /bin/portmap
64 root 0:00 /bin/inetd /etc/inetd.conf
65 root 0:01 -sh
66 root 0:00 /bin/syslogd -n
67 root 0:00 /bin/flatfsd
68 root 0:00 [kworker/0:2]
91 root 0:00 ps
~ # cat /proc/cpuinfo
CPU-Family: MicroBlaze
FPGA-Arch: spartan6
CPU-Ver: 8.00.a, little endian
CPU-MHz: 50.00
BogoMips: 24.06
HW:
Shift: yes
MSR: yes
PCMP: yes
DIV: yes
MMU: 3
MUL: v2
FPU: no
Exc: op0x0 unal ill iopb dopb zero
Icache: 16kB line length: 32B
Dcache: 16kB line length: 16B
write-through
HW-Debug: yes
PVR-USR1: 00
PVR-USR2: 00000000
Page size: 4096
~ #
--
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel 2.6 Microblaze Linux - http://www.monstr.eu/fdt/
Microblaze U-BOOT custodian
^ permalink raw reply
* Re: NFS on little-endian platform - Microblaze
From: Michal Simek @ 2011-02-16 13:16 UTC (permalink / raw)
Cc: netdev, David Miller, linux-nfs, Trond.Myklebust
In-Reply-To: <4D5BCC74.9010301-pSz03upnqPeHXe+LvDLADg@public.gmane.org>
Hi again,
I forgot to cc the linux-nfs mailing list.
Michal
P.S.: Tested on kernels 2.6.38-rc4, 2.6.37 and 2.6.36
Michal Simek wrote:
> Hi All,
>
> I am trying to understand a problem we have found.
> On the little-endian Microblaze platform I can't mount NFS without the
> -o nolock option. (Log below.)
> Selecting tcp or udp has no effect.
> I am using the emaclite driver, and there is no problem on big-endian
> Microblaze.
>
> ping, telnet, http, ftp, iperf and netperf all work well.
>
> Hence my question: is there any endian-specific option for NFS?
>
> Thanks,
> Michal
>
> ~ # mount -t nfs 192.168.0.101:/tftpboot/nfs /mnt
> svc: failed to register lockdv1 RPC service (errno 13).
> lockd_up: makesock failed, error=-13
> svc: failed to register lockdv1 RPC service (errno 13).
> ~ # mount -t nfs -o nolock 192.168.0.101:/tftpboot/nfs /mnt
> ~ # mount
> rootfs on / type rootfs (rw)
> proc on /proc type proc (rw,relatime)
> none on /var type ramfs (rw,relatime)
> none on /sys type sysfs (rw,relatime)
> none on /etc/config type ramfs (rw,relatime)
> none on /dev/pts type devpts (rw,relatime,mode=600)
> 192.168.0.101:/tftpboot/nfs on /mnt type nfs
> (rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,nolock,proto=udp,port=65535,timeo=7,retrans=3,sec=sys,local_lock=all,addr=192.168.0.101)
>
> ~ #
> ~ # ps
> PID USER TIME COMMAND
> 1 root 0:02 init
> 2 root 0:00 [kthreadd]
> 3 root 0:00 [ksoftirqd/0]
> 4 root 0:00 [kworker/0:0]
> 5 root 0:00 [kworker/u:0]
> 6 root 0:00 [khelper]
> 7 root 0:00 [sync_supers]
> 8 root 0:00 [bdi-default]
> 9 root 0:00 [kblockd]
> 10 root 0:00 [rpciod]
> 11 root 0:00 [kworker/0:1]
> 12 root 0:00 [kswapd0]
> 13 root 0:00 [fsnotify_mark]
> 14 root 0:00 [aio]
> 15 root 0:00 [nfsiod]
> 16 root 0:00 [kworker/u:1]
> 58 root 0:00 udhcpc -R -n -p /var/run/udhcpc.eth0.pid -i eth0
> 62 1 0:00 /bin/portmap
> 64 root 0:00 /bin/inetd /etc/inetd.conf
> 65 root 0:01 -sh
> 66 root 0:00 /bin/syslogd -n
> 67 root 0:00 /bin/flatfsd
> 68 root 0:00 [kworker/0:2]
> 91 root 0:00 ps
> ~ # cat /proc/cpuinfo
> CPU-Family: MicroBlaze
> FPGA-Arch: spartan6
> CPU-Ver: 8.00.a, little endian
> CPU-MHz: 50.00
> BogoMips: 24.06
> HW:
> Shift: yes
> MSR: yes
> PCMP: yes
> DIV: yes
> MMU: 3
> MUL: v2
> FPU: no
> Exc: op0x0 unal ill iopb dopb zero
> Icache: 16kB line length: 32B
> Dcache: 16kB line length: 16B
> write-through
> HW-Debug: yes
> PVR-USR1: 00
> PVR-USR2: 00000000
> Page size: 4096
> ~ #
>
>
>
--
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel 2.6 Microblaze Linux - http://www.monstr.eu/fdt/
Microblaze U-BOOT custodian
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [patch net-next-2.6 1/4] rtnetlink: implement setting of master device
From: Stephen Hemminger @ 2011-02-16 13:18 UTC (permalink / raw)
To: Jiri Pirko
Cc: netdev, davem, shemminger, kaber, fubar, eric.dumazet,
nicolas.2p.debian
In-Reply-To: <20110213193105.GD2740@psychotron.redhat.com>
On Sun, 13 Feb 2011 20:31:06 +0100
Jiri Pirko <jpirko@redhat.com> wrote:
> This patch allows userspace to enslave/release slave devices via netlink
> interface using IFLA_MASTER. This introduces a generic way to add/remove
> underlying devices.
>
> Signed-off-by: Jiri Pirko <jpirko@redhat.com>
But, setting master means something different for each type of device?
What happens if you move eth0 from br0 to bond0?
The name "master" is only used in the bonding spec. It is not used in
description of bridges in the 802.1 spec. There are also some companies
that have very "politically correct" HR departments that think that any
reference to master or slave is racist.
^ permalink raw reply
* Re: [GIT PULL nf-next-2.6] IPVS
From: Patrick McHardy @ 2011-02-16 13:19 UTC (permalink / raw)
To: Simon Horman
Cc: lvs-devel, netdev, netfilter-devel, netfilter, Julian Anastasov,
Patrick Schaaf
In-Reply-To: <1297836293-5942-1-git-send-email-horms@verge.net.au>
On 16.02.2011 07:04, Simon Horman wrote:
> Hi Patrick,
>
> please consider pulling
> git://git.kernel.org/pub/scm/linux/kernel/git/horms/lvs-test-2.6.git master
> to get:
>
> * Removal of unused ICMP code by Julian
> * More informative "no destination available" messages
> by Patrick Schaaf
> * Fix to buffering of synchronisation messages
> by Tinggong Wang and Julian
>
Pulled, thanks Simon.
^ permalink raw reply
* Re: [patch net-next-2.6 4/4] bridge: implement [add/del]_slave ops
From: Stephen Hemminger @ 2011-02-16 13:21 UTC (permalink / raw)
To: David Miller
Cc: jpirko, netdev, shemminger, kaber, fubar, eric.dumazet,
nicolas.2p.debian
In-Reply-To: <20110213.165903.184824754.davem@davemloft.net>
On Sun, 13 Feb 2011 16:59:03 -0800 (PST)
David Miller <davem@davemloft.net> wrote:
> From: Jiri Pirko <jpirko@redhat.com>
> Date: Sun, 13 Feb 2011 20:33:42 +0100
>
> > add possibility to addif/delif via rtnetlink
> >
> > Signed-off-by: Jiri Pirko <jpirko@redhat.com>
>
> Applied.
You should follow established protocol and wait until I have
had time to review code that impacts areas which I maintain.
The linux-foundation email address is not listed in MAINTAINERS
file and is mainly a spam catcher that I never read.
Maybe I should just start sending networking patches to Linus.
^ permalink raw reply
* Re: Off-by-one error in net/8021q/vlan.c
From: Patrick McHardy @ 2011-02-16 13:22 UTC (permalink / raw)
To: richard -rw- weinberger; +Cc: Phil Karn, netdev
In-Reply-To: <AANLkTinBOk8ZNQvRpMqZQE_vOu63QVzDZ4ceRRUDvJD_@mail.gmail.com>
On 16.02.2011 13:51, richard -rw- weinberger wrote:
> On Wed, Feb 16, 2011 at 11:58 AM, Phil Karn <karn@ka9q.net> wrote:
>> The range check on vlan_id in register_vlan_device is off by one, and it
>> prevents the creation of a vlan interface for vlan ID 4095. (OSX allows
>> this, I checked.)
>
> Then OSX should fix their code. 4095 is reserved.
I agree.
^ permalink raw reply
* Re: NFS on little-endian platform - Microblaze
From: Trond Myklebust @ 2011-02-16 13:22 UTC (permalink / raw)
To: monstr; +Cc: netdev, David Miller, linux-nfs
In-Reply-To: <4D5BCE43.1090401@monstr.eu>
On Wed, 2011-02-16 at 14:16 +0100, Michal Simek wrote:
> Hi again,
>
> I forgot to cc the linux-nfs mailing list.
>
> Michal
>
> P.S.: Tested on kernels 2.6.38-rc4, 2.6.37 and 2.6.36
>
> Michal Simek wrote:
> > Hi All,
> >
> > I am trying to understand a problem we have found.
> > On the little-endian Microblaze platform I can't mount NFS without the
> > -o nolock option. (Log below.)
> > Selecting tcp or udp has no effect.
> > I am using the emaclite driver, and there is no problem on big-endian
> > Microblaze.
> >
> > ping, telnet, http, ftp, iperf and netperf all work well.
> >
> > Hence my question: is there any endian-specific option for NFS?
> >
> > Thanks,
> > Michal
> >
> > ~ # mount -t nfs 192.168.0.101:/tftpboot/nfs /mnt
> > svc: failed to register lockdv1 RPC service (errno 13).
> > lockd_up: makesock failed, error=-13
> > svc: failed to register lockdv1 RPC service (errno 13).
> > ~ # mount -t nfs -o nolock 192.168.0.101:/tftpboot/nfs /mnt
> > ~ # mount
> > rootfs on / type rootfs (rw)
> > proc on /proc type proc (rw,relatime)
> > none on /var type ramfs (rw,relatime)
> > none on /sys type sysfs (rw,relatime)
> > none on /etc/config type ramfs (rw,relatime)
> > none on /dev/pts type devpts (rw,relatime,mode=600)
> > 192.168.0.101:/tftpboot/nfs on /mnt type nfs
> > (rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,nolock,proto=udp,port=65535,timeo=7,retrans=3,sec=sys,local_lock=all,addr=192.168.0.101)
> >
> > ~ #
> > ~ # ps
> > PID USER TIME COMMAND
> > 1 root 0:02 init
> > 2 root 0:00 [kthreadd]
> > 3 root 0:00 [ksoftirqd/0]
> > 4 root 0:00 [kworker/0:0]
> > 5 root 0:00 [kworker/u:0]
> > 6 root 0:00 [khelper]
> > 7 root 0:00 [sync_supers]
> > 8 root 0:00 [bdi-default]
> > 9 root 0:00 [kblockd]
> > 10 root 0:00 [rpciod]
> > 11 root 0:00 [kworker/0:1]
> > 12 root 0:00 [kswapd0]
> > 13 root 0:00 [fsnotify_mark]
> > 14 root 0:00 [aio]
> > 15 root 0:00 [nfsiod]
> > 16 root 0:00 [kworker/u:1]
> > 58 root 0:00 udhcpc -R -n -p /var/run/udhcpc.eth0.pid -i eth0
> > 62 1 0:00 /bin/portmap
> > 64 root 0:00 /bin/inetd /etc/inetd.conf
> > 65 root 0:01 -sh
> > 66 root 0:00 /bin/syslogd -n
> > 67 root 0:00 /bin/flatfsd
> > 68 root 0:00 [kworker/0:2]
> > 91 root 0:00 ps
Where is rpc.statd? Without it, the above behaviour is 100% expected.
Trond
--
Trond Myklebust
Linux NFS client maintainer
NetApp
Trond.Myklebust@netapp.com
www.netapp.com
^ permalink raw reply
* [PATCH] sfc: lower stack usage in efx_ethtool_self_test
From: Eric Dumazet @ 2011-02-16 13:48 UTC (permalink / raw)
To: Ben Hutchings; +Cc: David Miller, netdev
In-Reply-To: <1297800733.2584.15.camel@bwh-desktop>
drivers/net/sfc/ethtool.c: In function ‘efx_ethtool_self_test’:
drivers/net/sfc/ethtool.c:613: warning: the frame size of 1200 bytes
is larger than 1024 bytes
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
drivers/net/sfc/ethtool.c | 22 +++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)
diff --git a/drivers/net/sfc/ethtool.c b/drivers/net/sfc/ethtool.c
index 272cfe7..0836b37 100644
--- a/drivers/net/sfc/ethtool.c
+++ b/drivers/net/sfc/ethtool.c
@@ -569,9 +569,14 @@ static void efx_ethtool_self_test(struct net_device *net_dev,
struct ethtool_test *test, u64 *data)
{
struct efx_nic *efx = netdev_priv(net_dev);
- struct efx_self_tests efx_tests;
+ struct efx_self_tests *efx_tests;
int already_up;
- int rc;
+ int rc = -ENOMEM;
+
+ efx_tests = kzalloc(sizeof(*efx_tests), GFP_KERNEL);
+ if (!efx_tests)
+ goto fail;
+
ASSERT_RTNL();
if (efx->state != STATE_RUNNING) {
@@ -589,13 +594,11 @@ static void efx_ethtool_self_test(struct net_device *net_dev,
if (rc) {
netif_err(efx, drv, efx->net_dev,
"failed opening device.\n");
- goto fail2;
+ goto fail1;
}
}
- memset(&efx_tests, 0, sizeof(efx_tests));
-
- rc = efx_selftest(efx, &efx_tests, test->flags);
+ rc = efx_selftest(efx, efx_tests, test->flags);
if (!already_up)
dev_close(efx->net_dev);
@@ -604,10 +607,11 @@ static void efx_ethtool_self_test(struct net_device *net_dev,
rc == 0 ? "passed" : "failed",
(test->flags & ETH_TEST_FL_OFFLINE) ? "off" : "on");
- fail2:
- fail1:
+fail1:
/* Fill ethtool results structures */
- efx_ethtool_fill_self_tests(efx, &efx_tests, NULL, data);
+ efx_ethtool_fill_self_tests(efx, efx_tests, NULL, data);
+ kfree(efx_tests);
+fail:
if (rc)
test->flags |= ETH_TEST_FL_FAILED;
}
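The pattern in this patch, moving a large on-stack structure to the heap and unwinding with a goto label, can be sketched in plain userspace C. This is only an illustration: struct big_results, run_tests, self_test and self_test_demo are made-up names, calloc()/free() stand in for kzalloc()/kfree(), and the array size is arbitrary.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a results structure too large for the stack. */
struct big_results {
	int values[300];
};

/* Hypothetical test routine: records one result and reports success. */
static inline int run_tests(struct big_results *res)
{
	res->values[0] = 1;
	return 0;
}

/*
 * Returns 0 on success or a negative errno-style code.
 * rc starts at -ENOMEM so the allocation-failure path needs no extra
 * assignment before jumping to the exit label.
 */
static inline int self_test(void)
{
	struct big_results *res;
	int rc = -ENOMEM;

	/* calloc() zeroes the buffer, so no separate memset() is needed. */
	res = calloc(1, sizeof(*res));
	if (!res)
		goto fail;

	rc = run_tests(res);
	free(res);
fail:
	return rc;
}

/* On a system with memory available, the test path returns 0. */
static inline void self_test_demo(void)
{
	assert(self_test() == 0);
}
```

Only the pointer lives in the stack frame here, which is the effect the patch is after. The price is one extra failure path for the allocation.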
^ permalink raw reply related
* Re: NFS on little-endian platform - Microblaze
From: Michal Simek @ 2011-02-16 13:53 UTC (permalink / raw)
To: Trond Myklebust
Cc: netdev, David Miller, linux-nfs
In-Reply-To: <1297862575.6596.0.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
Trond Myklebust wrote:
> On Wed, 2011-02-16 at 14:16 +0100, Michal Simek wrote:
>> Hi again,
>>
>> I forgot to cc the linux-nfs mailing list.
>>
>> Michal
>>
>> P.S.: Tested on kernels 2.6.38-rc4, 2.6.37 and 2.6.36
>>
>> Michal Simek wrote:
>>> Hi All,
>>>
>>> I am trying to understand a problem we have found.
>>> On the little-endian Microblaze platform I can't mount NFS without the
>>> -o nolock option. (Log below.)
>>> Selecting tcp or udp has no effect.
>>> I am using the emaclite driver, and there is no problem on big-endian
>>> Microblaze.
>>>
>>> ping, telnet, http, ftp, iperf and netperf all work well.
>>>
>>> Hence my question: is there any endian-specific option for NFS?
>>>
>>> Thanks,
>>> Michal
>>>
>>> ~ # mount -t nfs 192.168.0.101:/tftpboot/nfs /mnt
>>> svc: failed to register lockdv1 RPC service (errno 13).
>>> lockd_up: makesock failed, error=-13
>>> svc: failed to register lockdv1 RPC service (errno 13).
>>> ~ # mount -t nfs -o nolock 192.168.0.101:/tftpboot/nfs /mnt
>>> ~ # mount
>>> rootfs on / type rootfs (rw)
>>> proc on /proc type proc (rw,relatime)
>>> none on /var type ramfs (rw,relatime)
>>> none on /sys type sysfs (rw,relatime)
>>> none on /etc/config type ramfs (rw,relatime)
>>> none on /dev/pts type devpts (rw,relatime,mode=600)
>>> 192.168.0.101:/tftpboot/nfs on /mnt type nfs
>>> (rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,nolock,proto=udp,port=65535,timeo=7,retrans=3,sec=sys,local_lock=all,addr=192.168.0.101)
>>>
>>> ~ #
>>> ~ # ps
>>> PID USER TIME COMMAND
>>> 1 root 0:02 init
>>> 2 root 0:00 [kthreadd]
>>> 3 root 0:00 [ksoftirqd/0]
>>> 4 root 0:00 [kworker/0:0]
>>> 5 root 0:00 [kworker/u:0]
>>> 6 root 0:00 [khelper]
>>> 7 root 0:00 [sync_supers]
>>> 8 root 0:00 [bdi-default]
>>> 9 root 0:00 [kblockd]
>>> 10 root 0:00 [rpciod]
>>> 11 root 0:00 [kworker/0:1]
>>> 12 root 0:00 [kswapd0]
>>> 13 root 0:00 [fsnotify_mark]
>>> 14 root 0:00 [aio]
>>> 15 root 0:00 [nfsiod]
>>> 16 root 0:00 [kworker/u:1]
>>> 58 root 0:00 udhcpc -R -n -p /var/run/udhcpc.eth0.pid -i eth0
>>> 62 1 0:00 /bin/portmap
>>> 64 root 0:00 /bin/inetd /etc/inetd.conf
>>> 65 root 0:01 -sh
>>> 66 root 0:00 /bin/syslogd -n
>>> 67 root 0:00 /bin/flatfsd
>>> 68 root 0:00 [kworker/0:2]
>>> 91 root 0:00 ps
>
> Where is rpc.statd? Without it, the above behaviour is 100% expected.
On BE I see that lockd is running. It is enabled on little endian too, but it has not started there.
Enabled options:
CONFIG_NETWORK_FILESYSTEMS=y
CONFIG_NFS_FS=y
CONFIG_NFS_V3=y
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y
CONFIG_NFS_COMMON=y
CONFIG_SUNRPC=y
On BE lockd is up.
69 root 0:00 /bin/flatfsd
71 root 0:00 [lockd]
73 root 0:00 ps
I have to look into why.
How is it started?
Michal
--
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel 2.6 Microblaze Linux - http://www.monstr.eu/fdt/
Microblaze U-BOOT custodian
^ permalink raw reply
* Re: NFS on little-endian platform - Microblaze
From: Trond Myklebust @ 2011-02-16 14:04 UTC (permalink / raw)
To: monstr; +Cc: netdev, David Miller, linux-nfs
In-Reply-To: <4D5BD6E5.8010903@monstr.eu>
On Wed, 2011-02-16 at 14:53 +0100, Michal Simek wrote:
> Trond Myklebust wrote:
> > On Wed, 2011-02-16 at 14:16 +0100, Michal Simek wrote:
> >> Hi again,
> >>
> >> I forgot to cc the linux-nfs mailing list.
> >>
> >> Michal
> >>
> >> P.S.: Tested on kernels 2.6.38-rc4, 2.6.37 and 2.6.36
> >>
> >> Michal Simek wrote:
> >>> Hi All,
> >>>
> >>> I am trying to understand a problem we have found.
> >>> On the little-endian Microblaze platform I can't mount NFS without the
> >>> -o nolock option. (Log below.)
> >>> Selecting tcp or udp has no effect.
> >>> I am using the emaclite driver, and there is no problem on big-endian
> >>> Microblaze.
> >>>
> >>> ping, telnet, http, ftp, iperf and netperf all work well.
> >>>
> >>> Hence my question: is there any endian-specific option for NFS?
> >>>
> >>> Thanks,
> >>> Michal
> >>>
> >>> ~ # mount -t nfs 192.168.0.101:/tftpboot/nfs /mnt
> >>> svc: failed to register lockdv1 RPC service (errno 13).
> >>> lockd_up: makesock failed, error=-13
> >>> svc: failed to register lockdv1 RPC service (errno 13).
> >>> ~ # mount -t nfs -o nolock 192.168.0.101:/tftpboot/nfs /mnt
> >>> ~ # mount
> >>> rootfs on / type rootfs (rw)
> >>> proc on /proc type proc (rw,relatime)
> >>> none on /var type ramfs (rw,relatime)
> >>> none on /sys type sysfs (rw,relatime)
> >>> none on /etc/config type ramfs (rw,relatime)
> >>> none on /dev/pts type devpts (rw,relatime,mode=600)
> >>> 192.168.0.101:/tftpboot/nfs on /mnt type nfs
> >>> (rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,nolock,proto=udp,port=65535,timeo=7,retrans=3,sec=sys,local_lock=all,addr=192.168.0.101)
> >>>
> >>> ~ #
> >>> ~ # ps
> >>> PID USER TIME COMMAND
> >>> 1 root 0:02 init
> >>> 2 root 0:00 [kthreadd]
> >>> 3 root 0:00 [ksoftirqd/0]
> >>> 4 root 0:00 [kworker/0:0]
> >>> 5 root 0:00 [kworker/u:0]
> >>> 6 root 0:00 [khelper]
> >>> 7 root 0:00 [sync_supers]
> >>> 8 root 0:00 [bdi-default]
> >>> 9 root 0:00 [kblockd]
> >>> 10 root 0:00 [rpciod]
> >>> 11 root 0:00 [kworker/0:1]
> >>> 12 root 0:00 [kswapd0]
> >>> 13 root 0:00 [fsnotify_mark]
> >>> 14 root 0:00 [aio]
> >>> 15 root 0:00 [nfsiod]
> >>> 16 root 0:00 [kworker/u:1]
> >>> 58 root 0:00 udhcpc -R -n -p /var/run/udhcpc.eth0.pid -i eth0
> >>> 62 1 0:00 /bin/portmap
> >>> 64 root 0:00 /bin/inetd /etc/inetd.conf
> >>> 65 root 0:01 -sh
> >>> 66 root 0:00 /bin/syslogd -n
> >>> 67 root 0:00 /bin/flatfsd
> >>> 68 root 0:00 [kworker/0:2]
> >>> 91 root 0:00 ps
> >
> > Where is rpc.statd? Without it, the above behaviour is 100% expected.
>
> On BE I see that lockd is running. It is enabled on little endian too, but it has not started there.
>
> Enabled options:
> CONFIG_NETWORK_FILESYSTEMS=y
> CONFIG_NFS_FS=y
> CONFIG_NFS_V3=y
> CONFIG_LOCKD=y
> CONFIG_LOCKD_V4=y
> CONFIG_NFS_COMMON=y
> CONFIG_SUNRPC=y
>
> On BE lockd is up.
> 69 root 0:00 /bin/flatfsd
> 71 root 0:00 [lockd]
> 73 root 0:00 ps
>
> I have to look into why.
> How is it started?
Either rpc.bind or rpc.portmap and then rpc.statd need to be started
manually (in that order) before you may mount the NFS partition without
'-onolock'. The lockd daemon itself will be started by the kernel
whenever there is a need for it.
Please check your 'init' boot scripts to find out why they are not being
started as expected.
--
Trond Myklebust
Linux NFS client maintainer
NetApp
Trond.Myklebust@netapp.com
www.netapp.com
^ permalink raw reply
* Re: NFS on little-endian platform - Microblaze
From: Michal Simek @ 2011-02-16 14:16 UTC (permalink / raw)
To: Trond Myklebust
Cc: netdev, David Miller, linux-nfs
In-Reply-To: <1297865074.6596.10.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
Trond Myklebust wrote:
> On Wed, 2011-02-16 at 14:53 +0100, Michal Simek wrote:
>> Trond Myklebust wrote:
>>> On Wed, 2011-02-16 at 14:16 +0100, Michal Simek wrote:
>>>> Hi again,
>>>>
>>>> I forgot to cc the linux-nfs mailing list.
>>>>
>>>> Michal
>>>>
>>>> P.S.: Tested on kernels 2.6.38-rc4, 2.6.37 and 2.6.36
>>>>
>>>> Michal Simek wrote:
>>>>> Hi All,
>>>>>
>>>>> I am trying to understand a problem we have found.
>>>>> On the little-endian Microblaze platform I can't mount NFS without the
>>>>> -o nolock option. (Log below.)
>>>>> Selecting tcp or udp has no effect.
>>>>> I am using the emaclite driver, and there is no problem on big-endian
>>>>> Microblaze.
>>>>>
>>>>> ping, telnet, http, ftp, iperf and netperf all work well.
>>>>>
>>>>> Hence my question: is there any endian-specific option for NFS?
>>>>>
>>>>> Thanks,
>>>>> Michal
>>>>>
>>>>> ~ # mount -t nfs 192.168.0.101:/tftpboot/nfs /mnt
>>>>> svc: failed to register lockdv1 RPC service (errno 13).
>>>>> lockd_up: makesock failed, error=-13
>>>>> svc: failed to register lockdv1 RPC service (errno 13).
>>>>> ~ # mount -t nfs -o nolock 192.168.0.101:/tftpboot/nfs /mnt
>>>>> ~ # mount
>>>>> rootfs on / type rootfs (rw)
>>>>> proc on /proc type proc (rw,relatime)
>>>>> none on /var type ramfs (rw,relatime)
>>>>> none on /sys type sysfs (rw,relatime)
>>>>> none on /etc/config type ramfs (rw,relatime)
>>>>> none on /dev/pts type devpts (rw,relatime,mode=600)
>>>>> 192.168.0.101:/tftpboot/nfs on /mnt type nfs
>>>>> (rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,nolock,proto=udp,port=65535,timeo=7,retrans=3,sec=sys,local_lock=all,addr=192.168.0.101)
>>>>>
>>>>> ~ #
>>>>> ~ # ps
>>>>> PID USER TIME COMMAND
>>>>> 1 root 0:02 init
>>>>> 2 root 0:00 [kthreadd]
>>>>> 3 root 0:00 [ksoftirqd/0]
>>>>> 4 root 0:00 [kworker/0:0]
>>>>> 5 root 0:00 [kworker/u:0]
>>>>> 6 root 0:00 [khelper]
>>>>> 7 root 0:00 [sync_supers]
>>>>> 8 root 0:00 [bdi-default]
>>>>> 9 root 0:00 [kblockd]
>>>>> 10 root 0:00 [rpciod]
>>>>> 11 root 0:00 [kworker/0:1]
>>>>> 12 root 0:00 [kswapd0]
>>>>> 13 root 0:00 [fsnotify_mark]
>>>>> 14 root 0:00 [aio]
>>>>> 15 root 0:00 [nfsiod]
>>>>> 16 root 0:00 [kworker/u:1]
>>>>> 58 root 0:00 udhcpc -R -n -p /var/run/udhcpc.eth0.pid -i eth0
>>>>> 62 1 0:00 /bin/portmap
>>>>> 64 root 0:00 /bin/inetd /etc/inetd.conf
>>>>> 65 root 0:01 -sh
>>>>> 66 root 0:00 /bin/syslogd -n
>>>>> 67 root 0:00 /bin/flatfsd
>>>>> 68 root 0:00 [kworker/0:2]
>>>>> 91 root 0:00 ps
>>> Where is rpc.statd? Without it, the above behaviour is 100% expected.
>> On BE I see that lockd is running. It is enabled on little endian too, but it has not started there.
>>
>> Enabled options:
>> CONFIG_NETWORK_FILESYSTEMS=y
>> CONFIG_NFS_FS=y
>> CONFIG_NFS_V3=y
>> CONFIG_LOCKD=y
>> CONFIG_LOCKD_V4=y
>> CONFIG_NFS_COMMON=y
>> CONFIG_SUNRPC=y
>>
>> On BE lockd is up.
>> 69 root 0:00 /bin/flatfsd
>> 71 root 0:00 [lockd]
>> 73 root 0:00 ps
>>
>> I have to look into why.
>> How is it started?
>
> Either rpc.bind or rpc.portmap and then rpc.statd need to be started
> manually (in that order) before you may mount the NFS partition without
> '-onolock'. The lockd daemon itself will be started by the kernel
> whenever there is a need for it.
There is no rpc* binary on the big-endian system either, which is why I think this is not the problem.
Only portmap is started on both systems, nothing more. If you want, I can send you the cpio archive.
>
> Please check your 'init' boot scripts to find out why they are not being
> started as expected.
ok.
Michal
--
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel 2.6 Microblaze Linux - http://www.monstr.eu/fdt/
Microblaze U-BOOT custodian
^ permalink raw reply
* Re: [patch net-next-2.6 1/4] rtnetlink: implement setting of master device
From: Jiri Pirko @ 2011-02-16 14:39 UTC (permalink / raw)
To: Stephen Hemminger
Cc: netdev, davem, shemminger, kaber, fubar, eric.dumazet,
nicolas.2p.debian
In-Reply-To: <20110216051844.53f577d5@s6510>
Wed, Feb 16, 2011 at 02:18:44PM CET, shemminger@vyatta.com wrote:
>On Sun, 13 Feb 2011 20:31:06 +0100
>Jiri Pirko <jpirko@redhat.com> wrote:
>
>> This patch allows userspace to enslave/release slave devices via netlink
>> interface using IFLA_MASTER. This introduces a generic way to add/remove
>> underlying devices.
>>
>> Signed-off-by: Jiri Pirko <jpirko@redhat.com>
>
>But, setting master means something different for each type of device?
Why isn't it correct to use master for a bridge as well? It had no meaning there.
>What happens if you move eth0 from br0 to bond0?
you mean by:
ip link set eth0 master br0
ip link set eth0 master bond0
It's first removed from bridge, then added into bond. No problem here.
>
>The name "master" is only used in the bonding spec. It is not used in
>description of bridges in the 802.1 spec. There are also some companies
>that have very "politically correct" HR departments that think that any
>reference to master or slave is racist.
>
>
^ permalink raw reply
* Re: [patch net-next-2.6 1/4] rtnetlink: implement setting of master device
From: Patrick McHardy @ 2011-02-16 15:25 UTC (permalink / raw)
To: Jiri Pirko
Cc: Stephen Hemminger, netdev, davem, shemminger, fubar, eric.dumazet,
nicolas.2p.debian
In-Reply-To: <20110216143923.GB5727@psychotron.brq.redhat.com>
On 16.02.2011 15:39, Jiri Pirko wrote:
> Wed, Feb 16, 2011 at 02:18:44PM CET, shemminger@vyatta.com wrote:
>> On Sun, 13 Feb 2011 20:31:06 +0100
>> Jiri Pirko <jpirko@redhat.com> wrote:
>>
>>> This patch allows userspace to enslave/release slave devices via netlink
>>> interface using IFLA_MASTER. This introduces a generic way to add/remove
>>> underlying devices.
>>>
>>> Signed-off-by: Jiri Pirko <jpirko@redhat.com>
>>
>> But, setting master means something different for each type of device?
>
> Why isn't it correct to use master for a bridge as well? It had no meaning there.
In fact the bridge netlink family uses IFLA_MASTER for exactly
the same purpose.
^ permalink raw reply
* Re: Off-by-one error in net/8021q/vlan.c
From: Phil Karn @ 2011-02-16 15:58 UTC (permalink / raw)
To: richard -rw- weinberger; +Cc: kaber, netdev
In-Reply-To: <AANLkTinBOk8ZNQvRpMqZQE_vOu63QVzDZ4ceRRUDvJD_@mail.gmail.com>
On 2/16/11 4:51 AM, richard -rw- weinberger wrote:
> On Wed, Feb 16, 2011 at 11:58 AM, Phil Karn <karn@ka9q.net> wrote:
>> The range check on vlan_id in register_vlan_device is off by one, and it
>> prevents the creation of a vlan interface for vlan ID 4095. (OSX allows
>> this, I checked.)
>
> Then OSX should fix their code. 4095 is reserved.
>
If it's reserved, then it's up to the user to reserve it.
I actually had reason to use this to fix a misconfigured host that was
using vlan 4095. This got in my way.
^ permalink raw reply
* Re: Off-by-one error in net/8021q/vlan.c
From: richard -rw- weinberger @ 2011-02-16 16:10 UTC (permalink / raw)
To: Phil Karn; +Cc: kaber, netdev
In-Reply-To: <4D5BF411.4020204@ka9q.net>
On Wed, Feb 16, 2011 at 4:58 PM, Phil Karn <karn@ka9q.net> wrote:
> On 2/16/11 4:51 AM, richard -rw- weinberger wrote:
>> On Wed, Feb 16, 2011 at 11:58 AM, Phil Karn <karn@ka9q.net> wrote:
>>> The range check on vlan_id in register_vlan_device is off by one, and it
>>> prevents the creation of a vlan interface for vlan ID 4095. (OSX allows
>>> this, I checked.)
>>
>> Then OSX should fix their code. 4095 is reserved.
>>
>
> If it's reserved, then it's up to the user to reserve it.
No.
See:
http://standards.ieee.org/getieee802/download/802.1Q-2005.pdf
--
Thanks,
//richard
^ permalink raw reply
* Re: Off-by-one error in net/8021q/vlan.c
From: Phil Karn @ 2011-02-16 16:28 UTC (permalink / raw)
To: richard -rw- weinberger; +Cc: kaber, netdev
In-Reply-To: <AANLkTikNrwd31RBj1gc6kSaT=qodS=A=YntM=72PMbDf@mail.gmail.com>
On 2/16/11 8:10 AM, richard -rw- weinberger wrote:
> On Wed, Feb 16, 2011 at 4:58 PM, Phil Karn <karn@ka9q.net> wrote:
>> On 2/16/11 4:51 AM, richard -rw- weinberger wrote:
>>> On Wed, Feb 16, 2011 at 11:58 AM, Phil Karn <karn@ka9q.net> wrote:
>>>> The range check on vlan_id in register_vlan_device is off by one, and it
>>>> prevents the creation of a vlan interface for vlan ID 4095. (OSX allows
>>>> this, I checked.)
>>>
>>> Then OSX should fix their code. 4095 is reserved.
>>>
>>
>> If it's reserved, then it's up to the user to reserve it.
>
> No.
>
> See:
> http://standards.ieee.org/getieee802/download/802.1Q-2005.pdf
>
Well, then I guess we all know better than the user. That's the Windows
Way...no, wait, I thought this is Linux.
The fact is that I did encounter a misconfigured switch using vlan 4095,
and because of this off-by-one error I was unable to talk to it and fix it.
I was hoping I wouldn't have to patch every new kernel I install.
* Re: Off-by-one error in net/8021q/vlan.c
From: richard -rw- weinberger @ 2011-02-16 16:35 UTC (permalink / raw)
To: Phil Karn; +Cc: kaber, netdev
In-Reply-To: <4D5BFB39.8070805@ka9q.net>
On Wed, Feb 16, 2011 at 5:28 PM, Phil Karn <karn@ka9q.net> wrote:
> On 2/16/11 8:10 AM, richard -rw- weinberger wrote:
>> On Wed, Feb 16, 2011 at 4:58 PM, Phil Karn <karn@ka9q.net> wrote:
>>> On 2/16/11 4:51 AM, richard -rw- weinberger wrote:
>>>> On Wed, Feb 16, 2011 at 11:58 AM, Phil Karn <karn@ka9q.net> wrote:
>>>>> The range check on vlan_id in register_vlan_device is off by one, and it
>>>>> prevents the creation of a vlan interface for vlan ID 4095. (OSX allows
>>>>> this, I checked.)
>>>>
>>>> Then OSX should fix their code. 4095 is reserved.
>>>>
>>>
>>> If it's reserved, then it's up to the user to reserve it.
>>
>> No.
>>
>> See:
>> http://standards.ieee.org/getieee802/download/802.1Q-2005.pdf
>>
>
> Well, then I guess we all know better than the user. That's the Windows
> Way...no, wait, I thought this is Linux.
>
> The fact is that I did encounter a misconfigured switch using vlan 4095,
> and because of this off-by-one error I was unable to talk to it and fix it.
>
> I was hoping I wouldn't have to patch every new kernel I install.
>
The switch violates the standard. Why should Linux also do so?
This would only produce more broken VLANs...
--
Thanks,
//richard
* Re: Off-by-one error in net/8021q/vlan.c
From: Eric Dumazet @ 2011-02-16 16:39 UTC (permalink / raw)
To: Phil Karn; +Cc: richard -rw- weinberger, kaber, netdev
In-Reply-To: <4D5BFB39.8070805@ka9q.net>
On Wednesday, 16 February 2011 at 08:28 -0800, Phil Karn wrote:
> On 2/16/11 8:10 AM, richard -rw- weinberger wrote:
> > On Wed, Feb 16, 2011 at 4:58 PM, Phil Karn <karn@ka9q.net> wrote:
> >> On 2/16/11 4:51 AM, richard -rw- weinberger wrote:
> >>> On Wed, Feb 16, 2011 at 11:58 AM, Phil Karn <karn@ka9q.net> wrote:
> >>>> The range check on vlan_id in register_vlan_device is off by one, and it
> >>>> prevents the creation of a vlan interface for vlan ID 4095. (OSX allows
> >>>> this, I checked.)
> >>>
> >>> Then OSX should fix their code. 4095 is reserved.
> >>>
> >>
> >> If it's reserved, then it's up to the user to reserve it.
> >
> > No.
> >
> > See:
> > http://standards.ieee.org/getieee802/download/802.1Q-2005.pdf
> >
>
> Well, then I guess we all know better than the user. That's the Windows
> Way...no, wait, I thought this is Linux.
>
> The fact is that I did encounter a misconfigured switch using vlan 4095,
> and because of this off-by-one error I was unable to talk to it and fix it.
>
> I was hoping I wouldn't have to patch every new kernel I install.
>
You can use an OSX gateway ;)
If we allow ID 4095, then some users will complain we violate rules.
Really you cannot push this patch in official kernel only to ease your
life ;)
* RE: Process for subsystem maintainers to get Hyper-V code out of staging. - CORRECTED RECIPIENTS
From: Hank Janssen @ 2011-02-16 17:43 UTC (permalink / raw)
To: Robert Hancock
Cc: shemminger@linux-foundation.org, netdev@vger.kernel.org,
davem@davemloft.net, ide, KY Srinivasan, Hashir Abdi,
Mike Sterling, Haiyang Zhang, gregkh@suse.de
In-Reply-To: <4D59CCAD.90503@gmail.com>
> From: Robert Hancock [mailto:hancockrwd@gmail.com]
> Sent: Monday, February 14, 2011 4:46 PM
> On 02/14/2011 05:42 PM, Hank Janssen wrote:
> >
> > MY APOLOGIES-I made a typo on James email address. I corrected it and
> resend.
> > Sorry for the double email.
> >
> >
> > Stephen/James/David,
> >
> > Greetings to you all. As you might be aware, we submitted Hyper-V
> drivers to the kernel 2009.
> > We have been extending these drivers with additional functionality
> and our primary focus now
> > is doing the work needed to exit the staging area.
> >
> > To give you some background, the following are Hyper-V specific Linux
> drivers:
> >
> > hv_vmbus The vmbus driver that is the
> bridge between guest and the
> > host
> > hv_storvsc The SCSI device driver
> > hv_blkvsc The IDE driver
>
> Given that the IDE subsystem (drivers/ide) is currently in
> maintenance-only mode, and isn't used by modern distributions, you
> likely want to make this a libata driver instead.
>
> Though, from what's in current git, it's not clear to me what the HV
> IDE
> (and SCSI) drivers are attempting to do. Is it really something that
> looks like an IDE controller from the guest OS point of view? If not,
> then having it as an IDE driver would be the wrong thing to do, it
> should be more of a generic block driver. In that case, then, why are
> there both SCSI and IDE drivers in the first place?
>
Robert,
Thank you very much for your responses. Today the Hyper-V host only supports
IDE and SCSI, and the code was initially written against the 2.6.9 kernel.
Hyper-V still treats them as separate interfaces and is designed to emulate
a pretty old BIOS.
My approach will be to dig into libata (something I do not have
much knowledge of) and see if we can use it and find a way to more sanely
merge the behavior of Hyper-V's IDE and SCSI.
Hank.
* Re: [RFC !!BONUS!! PATCH 6/5] ipv4: Delete routing cache.
From: David Miller @ 2011-02-16 18:09 UTC (permalink / raw)
To: eric.dumazet; +Cc: netdev
In-Reply-To: <1297842977.3201.7.camel@edumazet-laptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 16 Feb 2011 08:56:17 +0100
> On Tuesday, 15 February 2011 at 18:55 -0800, David Miller wrote:
>> From: David Miller <davem@davemloft.net>
>> Date: Wed, 09 Feb 2011 22:39:39 -0800 (PST)
>>
>> >
>> > Signed-off-by: David S. Miller <davem@davemloft.net>
>>
>> Ok, this patch had one nasty bug:
>>
>> > + if (!err == 0)
>>
>> Yeah... right.
>>
>> I'm actively testing this version at the moment, against net-next-2.6,
>> works fine thus far.
>>
>> --------------------
>> ipv4: Delete routing cache.
>>
>> Signed-off-by: David S. Miller <davem@davemloft.net>
>> ---
>
> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
>
> I suspect we can zap DST_NOCACHE later ?
Yes, the number of cleanups we can do after this patch is actually
quite large.
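For readers following the quoted `if (!err == 0)` line: the thread does not show the corrected condition, so the sketch below only illustrates how C parses the expression as written. The function names are hypothetical, and `no_error` is just one explicit spelling of a "no error" test, not necessarily what the patch intended.

```c
#include <stdbool.h>

/*
 * '!' binds tighter than '==', so "!err == 0" is parsed as "(!err) == 0".
 * The parentheses below only make that parse explicit.
 */
bool as_written(int err)
{
	return (!err) == 0;	/* true exactly when err is non-zero */
}

/* An explicit "no error" test (assumed intent, not taken from the patch). */
bool no_error(int err)
{
	return err == 0;
}
```

With these definitions, `as_written(-5)` is true and `as_written(0)` is false, which is the opposite of `no_error()` for the same inputs.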
* Re: Off-by-one error in net/8021q/vlan.c
From: Michał Mirosław @ 2011-02-16 18:41 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Phil Karn, richard -rw- weinberger, kaber, netdev
In-Reply-To: <1297874372.30541.29.camel@edumazet-laptop>
2011/2/16 Eric Dumazet <eric.dumazet@gmail.com>:
> On Wednesday, 16 February 2011 at 08:28 -0800, Phil Karn wrote:
>> On 2/16/11 8:10 AM, richard -rw- weinberger wrote:
>> > On Wed, Feb 16, 2011 at 4:58 PM, Phil Karn <karn@ka9q.net> wrote:
>> >> On 2/16/11 4:51 AM, richard -rw- weinberger wrote:
>> >>> On Wed, Feb 16, 2011 at 11:58 AM, Phil Karn <karn@ka9q.net> wrote:
>> >>>> The range check on vlan_id in register_vlan_device is off by one, and it
>> >>>> prevents the creation of a vlan interface for vlan ID 4095. (OSX allows
>> >>>> this, I checked.)
>> >>>
>> >>> Then OSX should fix their code. 4095 is reserved.
>> >> If it's reserved, then it's up to the user to reserve it.
>> > No.
>> > See:
>> > http://standards.ieee.org/getieee802/download/802.1Q-2005.pdf
>> Well, then I guess we all know better than the user. That's the Windows
>> Way...no, wait, I thought this is Linux.
>>
>> The fact is that I did encounter a misconfigured switch using vlan 4095,
>> and because of this off-by-one error I was unable to talk to it and fix it.
>>
>> I was hoping I wouldn't have to patch every new kernel I install.
> You can use an OSX gateway ;)
>
> If we allow ID 4095, then some users will complain we violate rules.
>
> Really you cannot push this patch in official kernel only to ease your
> life ;)
The idea is that you don't have to use ID 4095 and if you don't -
nothing's broken by just allowing it. The same goes with ID 0 - it's
defined to be 802.1p packet, but people do use it as normal VLAN
(especially with hardware that can cope with only small number of
VLANs at once).
Allowing it but with a big fat warning in logs is even better: "You
want your network broken? Sure, can do, but you have been warned."
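To make the two positions in this thread concrete, here is a minimal userspace sketch, not kernel code. `VLAN_VID_MASK` is 0x0fff as in the kernel headers; the function names and the warning text are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>

#define VLAN_VID_MASK 0x0fff	/* 12-bit VLAN ID field: values 0..4095 */

/* Current behaviour: ">=" rejects 4095 along with everything larger. */
bool vid_ok_strict(unsigned int vlan_id)
{
	return !(vlan_id >= VLAN_VID_MASK);
}

/*
 * Permissive variant (hypothetical): ">" accepts 4095, and the reserved
 * IDs 0 and 4095 are allowed but produce a warning.
 */
bool vid_ok_permissive(unsigned int vlan_id)
{
	if (vlan_id > VLAN_VID_MASK)
		return false;
	if (vlan_id == 0 || vlan_id == VLAN_VID_MASK)
		fprintf(stderr, "warning: VLAN ID %u is reserved by 802.1Q\n",
			vlan_id);
	return true;
}
```

The strict check refuses ID 4095 outright; the permissive one accepts it and leaves the decision to the administrator, at the cost of a log message.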
Best Regards,
Michał Mirosław
* [PATCH] bonding: added 802.3ad round-robin hashing policy and source mac selection mode
From: Oleg V. Ukhno @ 2011-02-16 19:13 UTC (permalink / raw)
To: netdev; +Cc: Jay Vosburgh, David S. Miller
This patch introduces two new, related features to the bonding module.
The first feature is a round-robin hashing policy, primarily intended
for use with 802.3ad mode, which puts each successive IPv4 or IPv6
packet onto the next available slave without taking into account
which layer 3 and above protocol is used.
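As a rough userspace illustration of the round-robin selection described above (a sketch, not the patch code itself): the slave index is the packet counter modulo the slave count, and the counter is 16 bits wide like rr_tx_counter in struct bonding. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Slave index for packet number pktcount when count slaves are available. */
int rr_slave_index(int count, uint16_t pktcount)
{
	assert(count > 0);
	return pktcount % count;
}

/* Count how many of the first npackets packets land on each slave. */
void rr_distribution(int count, unsigned int npackets, unsigned int *hits)
{
	uint16_t counter = 0;	/* wraps at 65536, like a u16 counter */
	unsigned int n;
	int i;

	for (i = 0; i < count; i++)
		hits[i] = 0;
	for (n = 0; n < npackets; n++)
		hits[rr_slave_index(count, counter++)]++;
}
```

With two slaves and 1000 packets, each slave carries 500 packets regardless of which connection the packets belong to.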
The second feature makes it possible to choose which MAC address is set
in the transmitted packet: when set to slave-src, it forces the
slave interface's real MAC address to be used as the source MAC address
in every packet sent via that slave interface.
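The source MAC rewrite can be pictured with a small userspace stand-in for the Ethernet header. This is a sketch only: `struct eth_header` and `set_source_mac` are hypothetical names, with the same field order as the kernel's struct ethhdr.

```c
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6	/* bytes in an Ethernet MAC address */

/* Userspace stand-in with the same field order as struct ethhdr. */
struct eth_header {
	uint8_t  h_dest[ETH_ALEN];
	uint8_t  h_source[ETH_ALEN];
	uint16_t h_proto;
};

/* Overwrite only the source address; destination and EtherType stay as-is. */
void set_source_mac(struct eth_header *eth, const uint8_t *slave_mac)
{
	memcpy(eth->h_source, slave_mac, ETH_ALEN);
}
```

Each frame leaving a given slave would then carry that slave's own address rather than the address of the bond device.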
The main goal of this patch is to make it possible to stripe a single
TCP stream equally over all available slaves, for both transmitted and
received packets.
This operating mode is not fully 802.3ad compliant and will cause
some packet reordering in the TCP stream, so some kernel tuning may be
required.
To work correctly, enabling the round-robin hashing policy together with
using the slaves' real MAC addresses as source MAC addresses in transmitted
packets requires a specific switch setting: the hashing mode for the
port-channel ("etherchannel") should be set to src-mac or src-dst-mac to get
correct load striping on the receiving host's etherchannel.
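To see why per-slave source MACs matter on the receive side, consider a toy model of src-mac hashing on the switch. The hash below (last byte of the source MAC modulo the port count) is purely an assumption for illustration; real switches use vendor-specific functions, and `egress_port` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical src-mac hash: last byte of the source MAC modulo the number
 * of ports in the receiving port-channel.
 */
int egress_port(const uint8_t *src_mac, int ports)
{
	assert(ports > 0);
	return src_mac[5] % ports;
}
```

If every frame carries the bond's single MAC address, this model always picks the same port. With two slave addresses ending in 0x10 and 0x11 and a two-port channel, the frames are split across ports 0 and 1.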
General requirements for using bonding in this operating mode are:
- an even, and preferably equal, number of slaves on the sending and receiving
hosts;
- equal RTT between the sending and receiving hosts on all slaves;
- a switch capable of doing etherchannels and of using the src-mac or src-dst-mac
hashing policy for egress load striping.
Signed-off-by: Oleg V. Ukhno <olegu@yandex-team.ru>
---
Documentation/networking/bonding.txt | 65 +++++++++++++++++++++++++++++++++++
drivers/net/bonding/bond_3ad.c | 2 -
drivers/net/bonding/bond_main.c | 60 +++++++++++++++++++++++++++++---
drivers/net/bonding/bond_sysfs.c | 50 ++++++++++++++++++++++++++
drivers/net/bonding/bonding.h | 7 +++
include/linux/if_bonding.h | 1
6 files changed, 178 insertions(+), 7 deletions(-)
diff -uprN -X linux-2.6/Documentation/dontdiff linux-2.6/Documentation/networking/bonding.txt linux-2.6.p/Documentation/networking/bonding.txt
--- linux-2.6/Documentation/networking/bonding.txt 2011-02-08 16:03:01.290281998 +0300
+++ linux-2.6.p/Documentation/networking/bonding.txt 2011-02-16 22:03:09.650281997 +0300
@@ -83,6 +83,7 @@ Table of Contents
12. Configuring Bonding for Maximum Throughput
12.1 Maximum Throughput in a Single Switch Topology
12.1.1 MT Bonding Mode Selection for Single Switch Topology
+12.1.1.1 Maximizing TCP Throughput for RX/TX for Single Switch Topology using layer2 mechanisms
12.1.2 MT Link Monitoring for Single Switch Topology
12.2 Maximum Throughput in a Multiple Switch Topology
12.2.1 MT Bonding Mode Selection for Multiple Switch Topology
@@ -761,6 +762,34 @@ xmit_hash_policy
conversations. Other implementations of 802.3ad may
or may not tolerate this noncompliance.
+ round-robin
+
+ This policy simply sends each successive packet on the next
+ slave interface, providing round-robin load striping
+ for transmitted data. This policy can be enabled with
+ any mode that supports choosing an alternate hash policy,
+ but was originally written for 802.3ad mode.
+
+ The main goal of this policy is to stripe TX load without
+ taking into account which layer 3 protocol is used; it
+ can be used to stripe a single TCP connection. When
+ enabled, it round-robins IPv4 and IPv6 packets
+ only.
+
+ There is also the src_mac_select option, which can be used
+ to configure RX load striping using the switch's hashing
+ algorithms on the receiving side. See the detailed description
+ below.
+
+ It is important to understand that this hashing policy
+ may cause TCP out-of-order packets when enabled
+ and must not be used when slaves have different bandwidth
+ and/or RTT in the receiver's direction. This algorithm is not
+ fully 802.3ad compliant. Some implementations of 802.3ad
+ may or may not tolerate this noncompliance.
+
+ The hashing formula is transmitted packet number % slave count.
+
The default value is layer2. This option was added in bonding
version 2.6.3. In earlier versions of bonding, this parameter
does not exist, and the layer2 policy is the only policy. The
@@ -2190,6 +2219,42 @@ balance-alb: This mode is everything tha
device driver must support changing the hardware address while
the device is open.
+12.1.1.1 Maximizing TCP Throughput for RX/TX for Single Switch Topology
+ using layer2 mechanisms
+----------------------------------------------------------------------
+ Besides the methods of load striping and configuring HA mentioned
+above, you can use the round-robin hashing policy and the src_mac_select "slave-src"
+setting to stripe TCP load near-equally over an even number of slaves. Please
+note that enabling the round-robin policy for balance-xor mode should turn it
+into a mode similar to balance-rr mode.
+ A specific switch configuration is also required to get the full
+benefit of both the round-robin hashing policy and the src_mac_select "slave-src"
+setting.
+ When you enable the round-robin xmit hashing policy and set
+src_mac_select to slave-src mode, each successive packet is
+transmitted over the next slave, with each packet's source MAC address set
+to the real MAC address of the corresponding slave interface, not that of the aggregate
+interface.
+ Imagine that you have two hosts (say, A and B), each connected
+using 2 slave interfaces to a switch with appropriate port-channels configured
+("etherchannels"). After you start transmitting TCP data from A to B with the
+round-robin hashing policy enabled, you will see that TX load is equally
+striped over host A's slaves, but all this traffic is received on only one
+of host B's slaves.
+ Now, you set the src_mac_select parameter to "slave-src" and
+configure the switch for src-mac hashing for "outgoing" etherchannel load
+striping. Now every packet sent from host A has the slave's MAC as its source MAC
+address, and the switch selects the port in host B's receiving
+port-channel based on the source MAC address of each
+packet, so you will get near-equal RX load striping, which does not depend
+on the layer 3 and above protocols used for data transmission.
+ It is important to understand that this load striping mode
+will only work correctly if the number of slaves on each side is at least
+even, and preferably equal and even.
+ This load striping mode can also cause TCP out-of-order packets,
+so you may need to tune your kernel to handle an increased number of
+reordered packets.
+
12.1.2 MT Link Monitoring for Single Switch Topology
----------------------------------------------------
diff -uprN -X linux-2.6/Documentation/dontdiff linux-2.6/drivers/net/bonding/bond_3ad.c linux-2.6.p/drivers/net/bonding/bond_3ad.c
--- linux-2.6/drivers/net/bonding/bond_3ad.c 2011-02-16 00:59:18.710282002 +0300
+++ linux-2.6.p/drivers/net/bonding/bond_3ad.c 2011-02-16 01:30:47.770281998 +0300
@@ -2419,7 +2419,7 @@ int bond_3ad_xmit_xor(struct sk_buff *sk
goto out;
}
- slave_agg_no = bond->xmit_hash_policy(skb, slaves_in_agg);
+ slave_agg_no = bond->xmit_hash_policy(skb, slaves_in_agg, bond->rr_tx_counter++);
bond_for_each_slave(bond, slave, i) {
struct aggregator *agg = SLAVE_AD_INFO(slave).port.aggregator;
diff -uprN -X linux-2.6/Documentation/dontdiff linux-2.6/drivers/net/bonding/bonding.h linux-2.6.p/drivers/net/bonding/bonding.h
--- linux-2.6/drivers/net/bonding/bonding.h 2011-02-16 00:59:18.720282002 +0300
+++ linux-2.6.p/drivers/net/bonding/bonding.h 2011-02-16 01:33:11.610282004 +0300
@@ -162,6 +162,7 @@ struct bond_params {
int tx_queues;
int all_slaves_active;
int resend_igmp;
+ int src_mac_select;
};
struct bond_parm_tbl {
@@ -235,7 +236,7 @@ struct bonding {
#endif /* CONFIG_PROC_FS */
struct list_head bond_list;
struct netdev_hw_addr_list mc_list;
- int (*xmit_hash_policy)(struct sk_buff *, int);
+ int (*xmit_hash_policy)(struct sk_buff *, int, int);
__be32 master_ip;
u16 flags;
u16 rr_tx_counter;
@@ -308,6 +309,9 @@ static inline bool bond_is_lb(const stru
#define BOND_ARP_VALIDATE_ALL (BOND_ARP_VALIDATE_ACTIVE | \
BOND_ARP_VALIDATE_BACKUP)
+#define BOND_MAC_SRC_DEFAULT 0
+#define BOND_MAC_SRC_SLAVE 1
+
static inline int slave_do_arp_validate(struct bonding *bond,
struct slave *slave)
{
@@ -402,6 +406,7 @@ extern const struct bond_parm_tbl arp_va
extern const struct bond_parm_tbl fail_over_mac_tbl[];
extern const struct bond_parm_tbl pri_reselect_tbl[];
extern struct bond_parm_tbl ad_select_tbl[];
+extern const struct bond_parm_tbl src_mac_select_tbl[];
#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
void bond_send_unsolicited_na(struct bonding *bond);
diff -uprN -X linux-2.6/Documentation/dontdiff linux-2.6/drivers/net/bonding/bond_main.c linux-2.6.p/drivers/net/bonding/bond_main.c
--- linux-2.6/drivers/net/bonding/bond_main.c 2011-02-16 00:59:18.720282002 +0300
+++ linux-2.6.p/drivers/net/bonding/bond_main.c 2011-02-16 22:08:22.650281997 +0300
@@ -111,6 +111,7 @@ static char *fail_over_mac;
static int all_slaves_active = 0;
static struct bond_params bonding_defaults;
static int resend_igmp = BOND_DEFAULT_RESEND_IGMP;
+static char *src_mac_select;
module_param(max_bonds, int, 0);
MODULE_PARM_DESC(max_bonds, "Max number of bonded devices");
@@ -152,7 +153,7 @@ module_param(ad_select, charp, 0);
MODULE_PARM_DESC(ad_select, "803.ad aggregation selection logic: stable (0, default), bandwidth (1), count (2)");
module_param(xmit_hash_policy, charp, 0);
MODULE_PARM_DESC(xmit_hash_policy, "XOR hashing method: 0 for layer 2 (default)"
- ", 1 for layer 3+4");
+ ", 1 for layer 3+4, 3 for round-robin");
module_param(arp_interval, int, 0);
MODULE_PARM_DESC(arp_interval, "arp interval in milliseconds");
module_param_array(arp_ip_target, charp, NULL, 0);
@@ -167,6 +168,9 @@ MODULE_PARM_DESC(all_slaves_active, "Kee
"0 for never (default), 1 for always.");
module_param(resend_igmp, int, 0);
MODULE_PARM_DESC(resend_igmp, "Number of IGMP membership reports to send on link failure");
+module_param(src_mac_select, charp, 0);
+MODULE_PARM_DESC(src_mac_select, "Source MAC selection mode: 0 or default (default), "
+ "1 or slave-src to use slave's MAC as packet's src MAC");
/*----------------------------- Global variables ----------------------------*/
@@ -206,6 +210,7 @@ const struct bond_parm_tbl xmit_hashtype
{ "layer2", BOND_XMIT_POLICY_LAYER2},
{ "layer3+4", BOND_XMIT_POLICY_LAYER34},
{ "layer2+3", BOND_XMIT_POLICY_LAYER23},
+{ "round-robin", BOND_XMIT_POLICY_LAYERRR},
{ NULL, -1},
};
@@ -238,6 +243,12 @@ struct bond_parm_tbl ad_select_tbl[] = {
{ NULL, -1},
};
+const struct bond_parm_tbl src_mac_select_tbl[] = {
+{ "default", BOND_MAC_SRC_DEFAULT},
+{ "slave-src", BOND_MAC_SRC_SLAVE},
+{ NULL, -1},
+};
+
/*-------------------------- Forward declarations ---------------------------*/
static void bond_send_gratuitous_arp(struct bonding *bond);
@@ -422,6 +433,7 @@ struct vlan_entry *bond_next_vlan(struct
int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb,
struct net_device *slave_dev)
{
+ struct ethhdr *eth_data;
skb->dev = slave_dev;
skb->priority = 1;
#ifdef CONFIG_NET_POLL_CONTROLLER
@@ -433,6 +445,15 @@ int bond_dev_queue_xmit(struct bonding *
slave_dev->priv_flags &= ~IFF_IN_NETPOLL;
} else
#endif
+ if (bond->params.src_mac_select == BOND_MAC_SRC_SLAVE &&
+ (skb->protocol == htons(ETH_P_IP) ||
+ skb->protocol == htons(ETH_P_IPV6))) {
+ skb_reset_mac_header(skb);
+ eth_data = eth_hdr(skb);
+ memcpy(eth_data->h_source, slave_dev->perm_addr,
+ ETH_ALEN);
+ }
+
dev_queue_xmit(skb);
return 0;
@@ -3261,6 +3282,13 @@ static void bond_info_show_master(struct
bond->params.xmit_policy);
}
+ if (bond->params.src_mac_select == BOND_MAC_SRC_DEFAULT ||
+ bond->params.src_mac_select == BOND_MAC_SRC_SLAVE) {
+ seq_printf(seq, "Source MAC select is: %s (%d)\n",
+ src_mac_select_tbl[bond->params.src_mac_select].modename,
+ bond->params.src_mac_select);
+ }
+
if (USES_PRIMARY(bond->params.mode)) {
seq_printf(seq, "Primary Slave: %s",
(bond->primary_slave) ?
@@ -3717,7 +3745,8 @@ void bond_unregister_arp(struct bonding
* Hash for the output device based upon layer 2 and layer 3 data. If
* the packet is not IP mimic bond_xmit_hash_policy_l2()
*/
-static int bond_xmit_hash_policy_l23(struct sk_buff *skb, int count)
+static int bond_xmit_hash_policy_l23(struct sk_buff *skb, int count,
+ int pktcount)
{
struct ethhdr *data = (struct ethhdr *)skb->data;
struct iphdr *iph = ip_hdr(skb);
@@ -3735,7 +3764,8 @@ static int bond_xmit_hash_policy_l23(str
* the packet is a frag or not TCP or UDP, just use layer 3 data. If it is
* altogether not IP, mimic bond_xmit_hash_policy_l2()
*/
-static int bond_xmit_hash_policy_l34(struct sk_buff *skb, int count)
+static int bond_xmit_hash_policy_l34(struct sk_buff *skb, int count,
+ int pktcount)
{
struct ethhdr *data = (struct ethhdr *)skb->data;
struct iphdr *iph = ip_hdr(skb);
@@ -3759,13 +3789,29 @@ static int bond_xmit_hash_policy_l34(str
/*
* Hash for the output device based upon layer 2 data
*/
-static int bond_xmit_hash_policy_l2(struct sk_buff *skb, int count)
+static int bond_xmit_hash_policy_l2(struct sk_buff *skb, int count,
+ int pktcount)
{
struct ethhdr *data = (struct ethhdr *)skb->data;
return (data->h_dest[5] ^ data->h_source[5]) % count;
}
+/*
+ * Round-robin over all active slaves (one packet per slave) for IP and IPv6,
+ * otherwise mimic bond_xmit_hash_policy_l2()
+ */
+static int bond_xmit_hash_policy_rr(struct sk_buff *skb, int count,
+ int pktcount)
+{
+ struct ethhdr *data = (struct ethhdr *)skb->data;
+ if (skb->protocol == htons(ETH_P_IP)
+ || skb->protocol == htons(ETH_P_IPV6)) {
+ return pktcount % count;
+ }
+ return (data->h_dest[5] ^ data->h_source[5]) % count;
+}
+
/*-------------------------- Device entry points ----------------------------*/
static int bond_open(struct net_device *bond_dev)
@@ -4395,7 +4441,8 @@ static int bond_xmit_xor(struct sk_buff
if (!BOND_IS_OK(bond))
goto out;
- slave_no = bond->xmit_hash_policy(skb, bond->slave_cnt);
+ slave_no = bond->xmit_hash_policy(skb, bond->slave_cnt,
+ bond->rr_tx_counter++);
bond_for_each_slave(bond, slave, i) {
slave_no--;
@@ -4492,6 +4539,9 @@ static void bond_set_xmit_hash_policy(st
case BOND_XMIT_POLICY_LAYER34:
bond->xmit_hash_policy = bond_xmit_hash_policy_l34;
break;
+ case BOND_XMIT_POLICY_LAYERRR:
+ bond->xmit_hash_policy = bond_xmit_hash_policy_rr;
+ break;
case BOND_XMIT_POLICY_LAYER2:
default:
bond->xmit_hash_policy = bond_xmit_hash_policy_l2;
diff -uprN -X linux-2.6/Documentation/dontdiff linux-2.6/drivers/net/bonding/bond_sysfs.c linux-2.6.p/drivers/net/bonding/bond_sysfs.c
--- linux-2.6/drivers/net/bonding/bond_sysfs.c 2011-02-08 16:03:02.950282003 +0300
+++ linux-2.6.p/drivers/net/bonding/bond_sysfs.c 2011-02-16 02:05:58.650281999 +0300
@@ -1643,6 +1643,55 @@ out:
static DEVICE_ATTR(resend_igmp, S_IRUGO | S_IWUSR,
bonding_show_resend_igmp, bonding_store_resend_igmp);
+/*
+ * Show and set the bonding src_mac_select param.
+ */
+
+static ssize_t bonding_show_src_mac_select(struct device *d,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct bonding *bond = to_bond(d);
+
+ return sprintf(buf, "%s %d\n",
+ src_mac_select_tbl[bond->params.src_mac_select].modename,
+ bond->params.src_mac_select);
+}
+
+static ssize_t bonding_store_src_mac_select(struct device *d,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ int new_value, ret = count;
+ struct bonding *bond = to_bond(d);
+
+ if (bond->dev->flags & IFF_UP) {
+ pr_err("%s: Interface is up. Unable to update src mac select policy.\n",
+ bond->dev->name);
+ ret = -EPERM;
+ goto out;
+ }
+
+ new_value = bond_parse_parm(buf, src_mac_select_tbl);
+ if (new_value < 0) {
+ pr_err("%s: Ignoring invalid src mac select policy value %.*s.\n",
+ bond->dev->name,
+ (int)strlen(buf) - 1, buf);
+ ret = -EINVAL;
+ goto out;
+ } else {
+ bond->params.src_mac_select = new_value;
+ pr_info("%s: setting src mac select policy to %s (%d).\n",
+ bond->dev->name,
+ src_mac_select_tbl[new_value].modename, new_value);
+ }
+out:
+ return ret;
+}
+
+static DEVICE_ATTR(src_mac_select, S_IRUGO | S_IWUSR,
+ bonding_show_src_mac_select, bonding_store_src_mac_select);
+
static struct attribute *per_bond_attrs[] = {
&dev_attr_slaves.attr,
&dev_attr_mode.attr,
@@ -1671,6 +1720,7 @@ static struct attribute *per_bond_attrs[
&dev_attr_queue_id.attr,
&dev_attr_all_slaves_active.attr,
&dev_attr_resend_igmp.attr,
+ &dev_attr_src_mac_select.attr,
NULL,
};
diff -uprN -X linux-2.6/Documentation/dontdiff linux-2.6/include/linux/if_bonding.h linux-2.6.p/include/linux/if_bonding.h
--- linux-2.6/include/linux/if_bonding.h 2011-02-16 00:59:18.720282002 +0300
+++ linux-2.6.p/include/linux/if_bonding.h 2011-02-16 01:23:38.660282000 +0300
@@ -91,6 +91,7 @@
#define BOND_XMIT_POLICY_LAYER2 0 /* layer 2 (MAC only), default */
#define BOND_XMIT_POLICY_LAYER34 1 /* layer 3+4 (IP ^ (TCP || UDP)) */
#define BOND_XMIT_POLICY_LAYER23 2 /* layer 2+3 (IP ^ MAC) */
+#define BOND_XMIT_POLICY_LAYERRR 3 /* round-robin mode */
typedef struct ifbond {
__s32 bond_mode;