* Re: [RFC PATCH net-next 0/5] Ease netns management for userland
From: Nicolas Dichtel @ 2012-12-12 20:54 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: netdev, davem, aatteka
In-Reply-To: <87fw3boyxn.fsf@xmission.com>
On 12/12/2012 20:25, Eric W. Biederman wrote:
> Nicolas Dichtel <nicolas.dichtel@6wind.com> writes:
>
>> The goal of this series is to ease netns management by daemons. Some systems use
>> netns only to virtualize the network stack and don't want to multiply userland
>> daemons. These systems may have a lot of netns, up to 2000. We don't want to
>> launch an instance of each daemon (quagga, strongswan, conntrackd, ...) for
>> each netns because that would consume a lot of resources. Having one daemon that
>> manages all netns is more efficient (mainly if there are few objects to manage:
>> one or two routes per netns, for example).
>> Hence, one goal of this series is to let a daemon monitor netns activity, so
>> that it can open or close netlink sockets and allocate or free the structures
>> needed to manage these netns when they are created or deleted.
>> To help identify a netns, an index has been added to each netns.
>>
>> A new setsockopt() option is also added, to help daemons open a socket in the
>> right netns. For now, a daemon that wants to open a socket in a specified netns
>> needs to call setns(CLONE_NEWNET) with an fd (not so easy to find), open the
>> socket and then call setns() again to go back to the initial netns. Having this
>> kind of setsockopt() will simplify operations. Obviously, this setsockopt()
>> should be done early enough (is a test on sk_state enough?). The first target is
>> netlink sockets, but it can be useful for other kinds of socket, which is why I
>> add a generic socket option.
>>
>> As usual, the patch against iproute2 will be sent once the patches are included
>> and net-next merged. I can send it on demand.
>
> Short answer: you don't need to do any of this.
>
> setns with the namespace files in /proc/<pid>/ns/net gives you more than
> enough mechanism to solve this problem. And iproute2 already supports
> all of this.
>
> And your approach creates very serious maintenance problems, to the
> point I don't even want to read your patches. What namespace do your
> namespace ids live in?
>
> A socket option to change the namespace of a socket is nasty because
> sockets changing which network namespace they are in leads to races that
> aren't worth writing the code to handle.
>
> Longer answer.
>
> You can bind mount the /proc/<pid>/ns/net namespace files to
> give them any name you want. This puts naming policy in userspace
> control, and nests just fine.
>
> You can open a socket in any network namespace you want just
> by calling setns before socket. Wrapping this idiom in a library call
> or, if there is sufficient need, in a socketat system call seems
> reasonable.
Yes, I agree that this SO_NETNS may be a bad idea.
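For reference, the setns-before-socket idiom could be wrapped in a helper along these lines. This is only a sketch: socket_in_netns() and netns_enter() are hypothetical names, not an existing libc or kernel interface, and the code assumes a single-threaded caller with enough privilege to call setns (setns only moves the calling thread). It goes through syscall() so that it also builds against a libc without a setns() wrapper.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <sys/socket.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef CLONE_NEWNET
#define CLONE_NEWNET 0x40000000 /* value from the kernel's sched.h */
#endif

/* Thin wrapper around the setns system call (hypothetical helper name). */
static int netns_enter(int fd)
{
    return (int)syscall(SYS_setns, fd, CLONE_NEWNET);
}

/*
 * Hypothetical helper: open a socket inside the network namespace that
 * ns_path refers to, then move the caller back to its original namespace.
 * Returns the socket fd, or -1 on any error.
 */
int socket_in_netns(const char *ns_path, int domain, int type, int protocol)
{
    int sock = -1;
    int orig_ns, target_ns;

    /* Keep a handle on the current namespace so we can return to it. */
    orig_ns = open("/proc/self/ns/net", O_RDONLY | O_CLOEXEC);
    if (orig_ns < 0)
        return -1;

    target_ns = open(ns_path, O_RDONLY | O_CLOEXEC);
    if (target_ns < 0) {
        close(orig_ns);
        return -1;
    }

    if (netns_enter(target_ns) == 0) {
        sock = socket(domain, type, protocol);
        /*
         * If we cannot get back, the caller is left in the target
         * namespace; report failure so it can bail out.
         */
        if (netns_enter(orig_ns) != 0 && sock >= 0) {
            close(sock);
            sock = -1;
        }
    }

    close(target_ns);
    close(orig_ns);
    return sock;
}
```

A caller would pass a namespace file such as /proc/<pid>/ns/net, or a bind mount of it, as ns_path.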
>
> There is the classic question of whether two network namespace files refer
> to the same network namespace; I have code in linux-next and in my pull
> request to Linus that gives those files a unique inode number.
Interesting to know that.
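With unique inode numbers, the comparison could be done from userland roughly like this. A sketch only: same_netns() is a hypothetical helper, and the result is meaningful only on kernels that actually give the namespace files stable, unique inode numbers.

```c
#include <sys/stat.h>

/*
 * Hypothetical helper: return 1 if both paths resolve to the same
 * (device, inode) pair, 0 if they differ, -1 if either stat() fails.
 */
int same_netns(const char *path_a, const char *path_b)
{
    struct stat a, b;

    if (stat(path_a, &a) != 0 || stat(path_b, &b) != 0)
        return -1;
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}
```

For example, same_netns("/proc/self/ns/net", "/var/run/netns/<name>") would tell a daemon whether a named namespace is the one it is already running in.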
>
> So please use the facilities already merged into the kernel.
OK, but how can a daemon get the list of netns? Suppose we want quagga to
manage all netns: how can it get this list in order to open the needed
netlink sockets?
For example, iproute2 is only aware of netns created with iproute2; it
will not detect other netns.
^ permalink raw reply
* Re: netconsole fun
From: Peter Hurley @ 2012-12-12 20:59 UTC (permalink / raw)
To: Neil Horman; +Cc: Cong Wang, netdev
In-Reply-To: <20121211164526.GB7481@neilslaptop.think-freely.org>
On Tue, 2012-12-11 at 11:45 -0500, Neil Horman wrote:
> On Tue, Dec 11, 2012 at 10:16:51AM -0500, Peter Hurley wrote:
> > On Tue, 2012-12-11 at 09:30 -0500, Neil Horman wrote:
> > > On Tue, Dec 11, 2012 at 09:19:52AM -0500, Peter Hurley wrote:
> > > > On Tue, 2012-12-11 at 04:51 +0000, Cong Wang wrote:
> > > > > On Mon, 10 Dec 2012 at 14:17 GMT, Peter Hurley <peter@hurleysoftware.com> wrote:
> > > > > > Now that netpoll has been disabled for slaved devices, is there a
> > > > > > recommended method of running netconsole on a machine that has a slaved
> > > > > > device?
> > > > > >
> > > > >
> > > > > Yes, running it on the master device instead.
> > > >
> > > > Thanks for the suggestion, but:
> > > >
> > > > [ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.7.0-rc8-xeon ...... netconsole=@192.168.10.99/br0,30000@192.168.10.100/xx:xx:xx:xx:xx:xx
> > > > ...
> > > > [ 5.289869] netpoll: netconsole: local port 6665
> > > > [ 5.289885] netpoll: netconsole: local IP 192.168.10.99
> > > > [ 5.289892] netpoll: netconsole: interface 'br0'
> > > > [ 5.289898] netpoll: netconsole: remote port 30000
> > > > [ 5.289907] netpoll: netconsole: remote IP 192.168.10.100
> > > > [ 5.289914] netpoll: netconsole: remote ethernet address xx:xx:xx:xx:xx:xx
> > > > [ 5.289922] netpoll: netconsole: br0 doesn't exist, aborting
> > > > [ 5.289929] netconsole: cleaning up
> > > > ...
> > > > [ 9.392291] Bridge firewalling registered
> > > > [ 9.396805] device eth1 entered promiscuous mode
> > > > [ 9.418350] eth1: setting full-duplex.
> > > > [ 9.421268] br0: port 1(eth1) entered forwarding state
> > > > [ 9.423354] br0: port 1(eth1) entered forwarding state
> > > >
> > > >
> > > > Is there a way to control or associate network device names prior to
> > > > udev renaming?
> > > >
> > > That looks like a systemd problem (or more specifically a boot dependency
> > > problem). You need to modify your netconsole unit/service file to start after
> > > all your networking is up. NetworkManager provides a dummy service file for
> > > this purpose, called NetworkManager-wait-online.service
> >
> > Ok. So with a single physical network interface that will be bridged,
> > netconsole cannot be used for kernel boot messages.
> >
> > With a machine with multiple nics, is there a way to control device
> > naming so that the interface name to be used by netconsole specified on
> > the boot command line will actually correspond to the intended
> > device? For example,
> >
> > [ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.7.0-rc8-xeon ...... netconsole=@192.168.1.123/eth0,30000@192.168.1.139/xx:xx:xx:xx:xx:xx
> > ....
> > [ 4.092184] 3c59x: Donald Becker and others.
> > [ 4.092204] 0000:07:05.0: 3Com PCI 3c905C Tornado at ffffc9000186cf80.
> > [ 4.094035] tg3.c:v3.125 (September 26, 2012)
> > ....
> > [ 4.125038] tg3 0000:08:00.0 eth1: Tigon3 [partno(BCM95754) rev b002] (PCI Express) MAC address xx:xx:xx:xx:xx:xx
> > [ 4.125055] tg3 0000:08:00.0 eth1: attached PHY is 5787 (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[0])
> > [ 4.125062] tg3 0000:08:00.0 eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
> > [ 4.125068] tg3 0000:08:00.0 eth1: dma_rwctrl[76180000] dma_mask[64-bit]
> >
> > This is attaching netconsole to the wrong device because bus
> > enumeration, and therefore load order, is not consistent from boot to
> > boot.
> >
> No, there's no way to do that. As you note, device enumeration isn't consistent
> across boots; that's why udev creates rules to rename devices based on immutable
> (or semi-immutable) data, like mac addresses or pci bus locations. Once that
> happens, you'll have consistent names for your interfaces, and that work will be
> guaranteed to be done after networkmanager has finished opening all the
> interfaces that it needs (hence my suggestion to make the netconsole service
> dependent on networkmanager service startup completing).
Just wondering if you think something like the patch below is
suitable/acceptable for insulating netconsole from inconsistent device
name scenarios without changing the existing semantics. The basic idea
is to allow an ethernet MAC address in the <dev> field of the
netconsole= options, and if a MAC address was specified rather than a
device name, to do the dev lookup from the MAC address instead.
This doesn't extend to, but also doesn't interfere with, the dynamic
config of netconsole via configfs.
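To make the parsing rule concrete, here is a userland sketch of the decision made on the <dev> field. It is not the kernel code: the patch relies on the kernel's mac_pton(), while parse_mac() and classify_dev_field() below are hypothetical stand-ins that accept only the exact xx:xx:xx:xx:xx:xx form.

```c
#include <ctype.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Return the value of one hex digit, or -1 if c is not a hex digit. */
static int hex_val(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    c = (char)tolower((unsigned char)c);
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    return -1;
}

/*
 * Parse "xx:xx:xx:xx:xx:xx". Returns 1 and fills mac on success;
 * returns 0 and leaves mac untouched otherwise.
 */
int parse_mac(const char *s, uint8_t mac[MAC_LEN])
{
    uint8_t tmp[MAC_LEN];
    int i;

    if (strlen(s) != 3 * MAC_LEN - 1)
        return 0;
    for (i = 0; i < MAC_LEN; i++) {
        int hi = hex_val(s[3 * i]);
        int lo = hex_val(s[3 * i + 1]);

        if (hi < 0 || lo < 0)
            return 0;
        if (i < MAC_LEN - 1 && s[3 * i + 2] != ':')
            return 0;
        tmp[i] = (uint8_t)((hi << 4) | lo);
    }
    memcpy(mac, tmp, MAC_LEN);
    return 1;
}

enum dev_field_kind { DEV_FIELD_NAME, DEV_FIELD_MAC };

/* A field that parses as a MAC address is a MAC; anything else is a name. */
enum dev_field_kind classify_dev_field(const char *field, uint8_t mac[MAC_LEN])
{
    return parse_mac(field, mac) ? DEV_FIELD_MAC : DEV_FIELD_NAME;
}
```

Because an interface name such as eth0 or br0 never parses as a MAC address, the existing name-based syntax keeps its meaning.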
Would you mind reviewing it?
Regards,
Peter
-- >% --
Subject: [PATCH] netconsole: allow mac addr to specify local interface device
Allow the local interface device to be specified by ethernet
MAC address. For example,
netconsole=@10.0.0.1/12:34:56:78:9a:bc,30000@10.0.0.3/cb:a9:87:65:43:21
This alternate form enables netconsole to start and log boot messages
even if the network device name varies (e.g., a machine with multiple NICs).
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
---
Documentation/networking/netconsole.txt | 9 +++++++--
drivers/net/netconsole.c | 2 ++
include/linux/netpoll.h | 1 +
net/core/netpoll.c | 19 +++++++++++++++++--
4 files changed, 27 insertions(+), 4 deletions(-)
diff --git a/Documentation/networking/netconsole.txt b/Documentation/networking/netconsole.txt
index 2e9e0ae2..2dfd703 100644
--- a/Documentation/networking/netconsole.txt
+++ b/Documentation/networking/netconsole.txt
@@ -23,12 +23,13 @@ Sender and receiver configuration:
It takes a string configuration parameter "netconsole" in the
following format:
- netconsole=[src-port]@[src-ip]/[<dev>],[tgt-port]@<tgt-ip>/[tgt-macaddr]
+ netconsole=[src-port]@[src-ip]/[dev|macaddr],[tgt-port]@<tgt-ip>/[tgt-macaddr]
where
src-port source for UDP packets (defaults to 6665)
src-ip source IP to use (interface address)
- dev network interface (eth0)
+ dev|macaddr network interface (eth0)
+ alternate: ethernet MAC address of network interface
tgt-port port for logging agent (6666)
tgt-ip IP address for logging agent
tgt-macaddr ethernet MAC address for logging agent (broadcast)
@@ -47,6 +48,10 @@ complete string enclosed in "quotes", thusly:
modprobe netconsole netconsole="@/,@10.0.0.2/;@/eth1,6892@10.0.0.3/"
+The alternate form for specifying the local network interface with the
+ethernet MAC address is useful when the device names are inconsistent from
+boot to boot (eg., if the machine has multiple NICs).
+
Built-in netconsole starts immediately after the TCP stack is
initialized and attempts to bring up the supplied dev at the supplied
address.
diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index 6989ebe..3808a31 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -180,6 +180,7 @@ static struct netconsole_target *alloc_param_target(char *target_config)
strlcpy(nt->np.dev_name, "eth0", IFNAMSIZ);
nt->np.local_port = 6665;
nt->np.remote_port = 6666;
+ memset(nt->np.local_mac, 0, ETH_ALEN);
memset(nt->np.remote_mac, 0xff, ETH_ALEN);
/* Parse parameters and setup netpoll */
@@ -560,6 +561,7 @@ static struct config_item *make_netconsole_target(struct config_group *group,
strlcpy(nt->np.dev_name, "eth0", IFNAMSIZ);
nt->np.local_port = 6665;
nt->np.remote_port = 6666;
+ memset(nt->np.local_mac, 0, ETH_ALEN);
memset(nt->np.remote_mac, 0xff, ETH_ALEN);
/* Initialize the config_item member */
diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index 66d5379..d646b26 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -20,6 +20,7 @@ struct netpoll {
__be32 local_ip, remote_ip;
u16 local_port, remote_port;
+ u8 local_mac[ETH_ALEN];
u8 remote_mac[ETH_ALEN];
struct list_head rx; /* rx_np list element */
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 77a0388..8910a95 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -660,6 +660,7 @@ void netpoll_print_options(struct netpoll *np)
np_info(np, "local port %d\n", np->local_port);
np_info(np, "local IP %pI4\n", &np->local_ip);
np_info(np, "interface '%s'\n", np->dev_name);
+ np_info(np, "local ethernet address %pM\n", np->local_mac);
np_info(np, "remote port %d\n", np->remote_port);
np_info(np, "remote IP %pI4\n", &np->remote_ip);
np_info(np, "remote ethernet address %pM\n", np->remote_mac);
@@ -693,7 +694,8 @@ int netpoll_parse_options(struct netpoll *np, char *opt)
if ((delim = strchr(cur, ',')) == NULL)
goto parse_failed;
*delim = 0;
- strlcpy(np->dev_name, cur, sizeof(np->dev_name));
+ if (!mac_pton(cur, np->local_mac))
+ strlcpy(np->dev_name, cur, sizeof(np->dev_name));
cur = delim;
}
cur++;
@@ -806,8 +808,21 @@ int netpoll_setup(struct netpoll *np)
struct in_device *in_dev;
int err;
- if (np->dev_name)
+ if (!is_zero_ether_addr(np->local_mac)) {
+ rcu_read_lock();
+ ndev = dev_getbyhwaddr_rcu(&init_net, ARPHRD_ETHER, np->local_mac);
+ if (!ndev) {
+ rcu_read_unlock();
+ np_err(np, "%pM doesn't exist, aborting\n", np->local_mac);
+ return -ENODEV;
+ }
+ dev_hold(ndev);
+ rcu_read_unlock();
+ strlcpy(np->dev_name, ndev->name, IFNAMSIZ);
+
+ } else if (np->dev_name)
ndev = dev_get_by_name(&init_net, np->dev_name);
+
if (!ndev) {
np_err(np, "%s doesn't exist, aborting\n", np->dev_name);
return -ENODEV;
--
1.8.0.1
^ permalink raw reply related
* Re: [RFC PATCH net-next 0/5] Ease netns management for userland
From: Eric W. Biederman @ 2012-12-12 21:11 UTC (permalink / raw)
To: nicolas.dichtel; +Cc: netdev, davem, aatteka
In-Reply-To: <50C8EEF0.2010201@6wind.com>
Nicolas Dichtel <nicolas.dichtel@6wind.com> writes:
> On 12/12/2012 20:25, Eric W. Biederman wrote:
>> Short answer: you don't need to do any of this.
>>
>> setns with the namespace files in /proc/<pid>/ns/net gives you more than
>> enough mechanism to solve this problem. And iproute2 already supports
>> all of this.
>>
>> And your approach creates very serious maintenance problems, to the
>> point I don't even want to read your patches. What namespace do your
>> namespace ids live in?
>>
>> A socket option to change the namespace of a socket is nasty because
>> sockets changing which network namespace they are in leads to races that
>> aren't worth writing the code to handle.
>>
>> Longer answer.
>>
>> You can bind mount the /proc/<pid>/ns/net namespace files to
>> give them any name you want. This puts naming policy in userspace
>> control, and nests just fine.
>>
>> You can open a socket in any network namespace you want just
>> by calling setns before socket. Wrapping this idiom in a library call
>> or, if there is sufficient need, in a socketat system call seems
>> reasonable.
> Yes, I agree that this SO_NETNS may be a bad idea.
>
>>
>> There is the classic question of whether two network namespace files refer
>> to the same network namespace; I have code in linux-next and in my pull
>> request to Linus that gives those files a unique inode number.
> Interesting to know that.
>
>>
>> So please use the facilities already merged into the kernel.
> OK, but how can a daemon get the list of netns? Suppose we want quagga to
> manage all netns: how can it get this list in order to open the needed
> netlink sockets?
>
> For example, iproute2 is only aware of netns created with iproute2; it
> will not detect other netns.
iproute2 is only aware of network namespaces created with the convention
that iproute2 uses.
If you want other network namespaces to be visible globally use the same
or a similar convention. All iproute2 does is
"mount --bind /proc/<pid>/ns/net /var/run/netns/<name>". So this
convention is not hard to follow.
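A daemon that follows this convention could enumerate the named namespaces by walking the directory. A minimal sketch, assuming the /var/run/netns layout described above; for_each_named_netns() and the callback type are hypothetical names, and namespaces that were never bind mounted there stay invisible to it.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Called once per entry with its name and full path. */
typedef void (*netns_cb)(const char *name, const char *path, void *arg);

/*
 * Hypothetical helper: report every entry of dir (for example
 * "/var/run/netns") to cb. Returns the number of entries reported,
 * or -1 if the directory cannot be opened.
 */
int for_each_named_netns(const char *dir, netns_cb cb, void *arg)
{
    DIR *d = opendir(dir);
    struct dirent *ent;
    char path[4096];
    int count = 0;

    if (!d)
        return -1;

    while ((ent = readdir(d)) != NULL) {
        int n;

        if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
            continue;
        n = snprintf(path, sizeof(path), "%s/%s", dir, ent->d_name);
        if (n < 0 || (size_t)n >= sizeof(path))
            continue; /* path does not fit, skip this entry */
        if (cb)
            cb(ent->d_name, path, arg);
        count++;
    }

    closedir(d);
    return count;
}
```

Each reported path can then be opened and handed to setns, so the daemon only touches namespaces that someone deliberately published under that directory.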
It is very wrong to presume that, without context, you know the reason for
the existence of any network namespace, and that you should, or even that
you can, manage it. Think of running your multi-network namespace
managing application in a container.
Eric
^ permalink raw reply
* Re: [RFC PATCH net-next 0/5] Ease netns management for userland
From: Eric W. Biederman @ 2012-12-12 21:48 UTC (permalink / raw)
To: nicolas.dichtel; +Cc: netdev, davem, aatteka
In-Reply-To: <87zk1jht7d.fsf@xmission.com>
ebiederm@xmission.com (Eric W. Biederman) writes:
> It is very wrong to presume that without context you know the reason for
> the existence of any network namespace and that you should or even that
> you can manage it. Think of running your multi-network namespace
> managing application in a container.
Good examples of network namespaces you don't want to mess with are
the network namespaces created by vsftp and chrome for security purposes,
to remove any possibility of creating new connections to the network.
Eric
^ permalink raw reply
* Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
From: Ani Sinha @ 2012-12-12 21:53 UTC (permalink / raw)
To: Michael Richardson; +Cc: netdev, tcpdump-workers, Francesco Ruggeri
In-Reply-To: <21992.1351723328@obiwan.sandelman.ca>
>
> unsigned int netdev_8021q_inskb = 1;
>
> ...
> {
> .ctl_name = NET_CORE_8021q_INSKB,
> .procname = "netdev_8021q_inskb",
> .data = &netdev_8021q_inskb,
> .maxlen = sizeof(int),
> .mode = 0444,
> .proc_handler = proc_dointvec
> },
>
> would seem to do it to me.
> Then pcap can fopen("/proc/sys/net/core/netdev_8021q_inskb") and if it
> finds it, and it is >0, then do the cmsg thing.
>
Does this work? This is just an experimental patch and by no means final.
I just want to get an idea of what everyone thinks about it. Once we have
debated and discussed it, I can cook up a final patch that would be worth
committing.
Also, instead of having this /proc interface, we could perhaps check for a
specific kernel version that:
(a) has the vlan tag info in the skb metadata (as opposed to in the packet
itself)
(b) has the following patch that adds the capability to generate a filter
based on the tag value:
commit f3335031b9452baebfe49b8b5e55d3fe0c4677d1
Author: Eric Dumazet <edumazet@google.com>
Date: Sat Oct 27 02:26:17 2012 +0000
net: filter: add vlan tag access
We need both of the above for userland to generate filter code that
compares vlan tag values in the skb metadata. For kernels that have the
vlan tag in the skb metadata but do not have commit (b) above, there is
nothing that can be done. For older kernels that have the vlan tag info in
the packet itself, the filter code can be generated differently to look at
specific offsets within the packet (something that libpcap does
currently).
We have already ruled out the idea of generating a filter, trying to
load it, and seeing whether that fails (see previous emails on this thread).
Hope this makes sense.
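On the libpcap side, the probe could look roughly like the following. This is a sketch under the assumption that the sysctl file proposed in this thread exists; nothing here is a merged interface, and both helper names are hypothetical. The path is passed in because the file name is still under discussion.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read one integer from a sysctl-style file.
 * Returns 0 and stores the number in *value, or -1 if the file is
 * missing or does not start with an integer.
 */
int read_int_file(const char *path, int *value)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (!f)
        return -1;
    ok = (fscanf(f, "%d", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/*
 * Returns 1 if the file exists and holds a value > 0, meaning the
 * kernel keeps the vlan tag in skb metadata; returns 0 otherwise.
 */
int vlan_tag_in_skb_metadata(const char *path)
{
    int v;

    return read_int_file(path, &v) == 0 && v > 0;
}
```

A missing file is treated the same as a zero value, so an older kernel falls back to the existing offset-based filter generation.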
diff --git a/include/linux/filter.h b/include/linux/filter.h
index c45eabc..91e2ba3 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -36,6 +36,7 @@ static inline unsigned int sk_filter_len(const struct sk_filter *fp)
return fp->len * sizeof(struct sock_filter) + sizeof(*fp);
}
+extern bool sysctl_8021q_inskb;
extern int sk_filter(struct sock *sk, struct sk_buff *skb);
extern unsigned int sk_run_filter(const struct sk_buff *skb,
const struct sock_filter *filter);
diff --git a/net/core/filter.c b/net/core/filter.c
index c23543c..4f5a657 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -41,6 +41,8 @@
#include <linux/seccomp.h>
#include <linux/if_vlan.h>
+bool sysctl_8021q_inskb = 1;
+
/* No hurry in this branch
*
* Exported for the bpf jit load helper.
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index d1b0804..f9a3700 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -15,6 +15,7 @@
#include <linux/init.h>
#include <linux/slab.h>
#include <linux/kmemleak.h>
+#include <linux/filter.h>
#include <net/ip.h>
#include <net/sock.h>
@@ -189,6 +190,13 @@ static struct ctl_table net_core_table[] = {
.mode = 0644,
.proc_handler = proc_dointvec
},
+ {
+ .procname = "8021q_inskb",
+ .data = &sysctl_8021q_inskb,
+ .maxlen = sizeof(bool),
+ .mode = 0444,
+ .proc_handler = proc_dointvec
+ },
{ }
};
^ permalink raw reply related
* Re: [PATCH] net: filter: return -EINVAL if BPF_S_ANC* operation is not supported
From: Ani Sinha @ 2012-12-12 22:06 UTC (permalink / raw)
To: Daniel Borkmann; +Cc: Eric Dumazet, David Miller, netdev
In-Reply-To: <50C8B008.2000804@redhat.com>
On Wed, Dec 12, 2012 at 8:25 AM, Daniel Borkmann <dborkman@redhat.com> wrote:
> On 12/12/2012 01:22 PM, Eric Dumazet wrote:
>>
>> On Wed, 2012-12-12 at 10:31 +0100, Daniel Borkmann wrote:
>>>
>>> Currently, we return -EINVAL for malicious or wrong BPF filters.
>>> However, this is not done for BPF_S_ANC* operations, which makes it
>>> more difficult to detect if it's actually supported or not by the
>>> BPF machine. Therefore, we should also return -EINVAL if K is within
>>> the SKF_AD_OFF universe and the ancillary operation did not match.
>>>
>>> Cc: Ani Sinha <ani@aristanetworks.com>
>>> Cc: Eric Dumazet <eric.dumazet@gmail.com>
>>> Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
>>> ---
>>> net/core/filter.c | 8 +++++++-
>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/net/core/filter.c b/net/core/filter.c
>>> index c23543c..de9bed4 100644
>>> --- a/net/core/filter.c
>>> +++ b/net/core/filter.c
>>> @@ -531,7 +531,7 @@ int sk_chk_filter(struct sock_filter *filter, unsigned int flen)
>>> [BPF_JMP|BPF_JSET|BPF_K] = BPF_S_JMP_JSET_K,
>>> [BPF_JMP|BPF_JSET|BPF_X] = BPF_S_JMP_JSET_X,
>>> };
>>> - int pc;
>>> + int pc, anc_found;
>>>
>>> if (flen == 0 || flen > BPF_MAXINSNS)
>>> return -EINVAL;
>>> @@ -592,8 +592,10 @@ int sk_chk_filter(struct sock_filter *filter, unsigned int flen)
>>> case BPF_S_LD_W_ABS:
>>> case BPF_S_LD_H_ABS:
>>> case BPF_S_LD_B_ABS:
>>> + anc_found = 0;
>>> #define ANCILLARY(CODE) case SKF_AD_OFF + SKF_AD_##CODE: \
>>> code = BPF_S_ANC_##CODE; \
>>> + anc_found = 1; \
>>> break
>>> switch (ftest->k) {
>>> ANCILLARY(PROTOCOL);
>>> @@ -610,6 +612,10 @@ int sk_chk_filter(struct sock_filter *filter, unsigned int flen)
>>> ANCILLARY(VLAN_TAG);
>>> ANCILLARY(VLAN_TAG_PRESENT);
>>> }
>>> +
>>> + /* ancillary operation unkown or unsupported */
>>> + if (anc_found == 0 && ftest->k >= SKF_AD_OFF)
>>> + return -EINVAL;
>>> }
>>> ftest->code = code;
>>> }
>>
>>
>> Several points :
>>
>> 1) This might break a userland filter that was previously working, by
>> returning 0 when load_pointer() returns NULL.
>>
>> Specifying an offset bigger than skb->len is not _invalid_, it only
>> makes a filter return 0, because load_pointer() returns NULL.
>
>
> I think it will not break for code, that calls load_pointer() in such a
> circumstance which passed the sk_chk_filter() test. However, it will
> "break" for code that calls ...
>
> { BPF_LD | BPF_(W|H|B) | BPF_ABS, 0, 0, <K> },
>
> ... where <K> is in [0xfffff000, 0xffffffff] _and_ <K> is not an ancillary.
>
> But ...
>
> Assuming some old code will have such an instruction where <K> is between
> [0xfffff000, 0xffffffff] and it doesn't know ancillary operations, then
> this will give a non-expected/unwanted behavior as well (since we do not
> return the BPF machine with 0 as it probably was the case before anc.ops,
> but load sth. into the accumulator instead and continue with the next
> instruction, for instance), right? Thus, following this argumentation, user
> space code would already have been broken by introducing ancillary
> operations into the BPF machine per se.
>
> This is probably just an assumption, but code that does such a direct load,
> e.g. "load word at packet offset 0xffffffff into accumulator"
> ("ld [0xffffffff]"), is quite broken, isn't it? Isn't the whole assumption of
> ancillary operations that no-one intentionally calls things like
> "ld [0xffffffff]" and expects this word to be loaded from the packet offset?
>
>
>> 2) This won't help applications running on old kernels where your patch
>> won't be applied, as already mentioned yesterday.
>
>
> Agreed, but leaving old kernels aside, it would be nice if newer kernels
> could validate that, so at least from kernel <xyz> onwards it could be
> checked _for sure_ if anc.op <abc> is present and can be used.
>
I second that. It would be nice to have a clean way to know whether a
particular ancillary operation is supported by the kernel. After all,
the latest kernel of today will be an ancient one soon enough ;)
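To illustrate the distinction being discussed, a checker could classify the K operand of an absolute load like this. A sketch only: the range start comes from the [0xfffff000, 0xffffffff] interval quoted above, while known_anc_count is a hypothetical parameter standing for the number of ancillary operations a given kernel implements.

```c
#include <stdint.h>

/* Start of the ancillary range, from the interval quoted in this thread. */
#define ANC_RANGE_START 0xfffff000u

enum abs_load_kind {
    LOAD_NOT_ANCILLARY, /* K is below the ancillary range */
    LOAD_ANC_KNOWN,     /* K selects an ancillary op this kernel implements */
    LOAD_ANC_UNKNOWN    /* K is in the range but matches no known op */
};

/*
 * Classify the K operand of a BPF_LD|BPF_ABS instruction.
 * known_anc_count is an assumed input: how many ancillary operations,
 * numbered from 0, the kernel in question understands.
 */
enum abs_load_kind classify_abs_load(uint32_t k, uint32_t known_anc_count)
{
    if (k < ANC_RANGE_START)
        return LOAD_NOT_ANCILLARY;
    if (k - ANC_RANGE_START < known_anc_count)
        return LOAD_ANC_KNOWN;
    return LOAD_ANC_UNKNOWN;
}
```

With the proposed change, the LOAD_ANC_UNKNOWN case is the one that would be rejected with -EINVAL when the filter is attached, which is what would let userland probe for a specific ancillary operation.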
^ permalink raw reply
* Re: vlan tagged packets and libpcap breakage
From: Ani Sinha @ 2012-12-12 22:16 UTC (permalink / raw)
To: Michael Richardson, Eric W. Biederman
Cc: netdev, Francesco Ruggeri, tcpdump-workers
In-Reply-To: <alpine.OSX.2.00.1212121205040.78903@animac.local>
+ Eric B.
On Wed, Dec 12, 2012 at 1:53 PM, Ani Sinha <ani@aristanetworks.com> wrote:
>
>>
>> unsigned int netdev_8021q_inskb = 1;
>>
>> ...
>> {
>> .ctl_name = NET_CORE_8021q_INSKB,
>> .procname = "netdev_8021q_inskb",
>> .data = &netdev_8021q_inskb,
>> .maxlen = sizeof(int),
>> .mode = 0444,
>> .proc_handler = proc_dointvec
>> },
>>
>> would seem to do it to me.
>> Then pcap can fopen("/proc/sys/net/core/netdev_8021q_inskb") and if it
>> finds it, and it is >0, then do the cmsg thing.
>>
>
> Does this work? This is just an experimental patch and by no means final.
> I just want to get an idea of what everyone thinks about it. Once we have
> debated and discussed it, I can cook up a final patch that would be worth
> committing.
>
> Also, instead of having this /proc interface, we could perhaps check for a
> specific kernel version that:
>
> (a) has the vlan tag info in the skb metadata (as opposed to in the packet
> itself)
> (b) has the following patch that adds the capability to generate a filter
> based on the tag value:
>
> commit f3335031b9452baebfe49b8b5e55d3fe0c4677d1
> Author: Eric Dumazet <edumazet@google.com>
> Date: Sat Oct 27 02:26:17 2012 +0000
>
> net: filter: add vlan tag access
>
> We need both of the above for userland to generate filter code that
> compares vlan tag values in the skb metadata. For kernels that have the
> vlan tag in the skb metadata but do not have commit (b) above, there is
> nothing that can be done. For older kernels that have the vlan tag info in
> the packet itself, the filter code can be generated differently to look at
> specific offsets within the packet (something that libpcap does
> currently).
>
> We have already ruled out the idea of generating a filter, trying to
> load it, and seeing whether that fails (see previous emails on this thread).
>
> Hope this makes sense.
>
>
> diff --git a/include/linux/filter.h b/include/linux/filter.h
> index c45eabc..91e2ba3 100644
> --- a/include/linux/filter.h
> +++ b/include/linux/filter.h
> @@ -36,6 +36,7 @@ static inline unsigned int sk_filter_len(const struct sk_filter *fp)
> return fp->len * sizeof(struct sock_filter) + sizeof(*fp);
> }
>
> +extern bool sysctl_8021q_inskb;
> extern int sk_filter(struct sock *sk, struct sk_buff *skb);
> extern unsigned int sk_run_filter(const struct sk_buff *skb,
> const struct sock_filter *filter);
> diff --git a/net/core/filter.c b/net/core/filter.c
> index c23543c..4f5a657 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -41,6 +41,8 @@
> #include <linux/seccomp.h>
> #include <linux/if_vlan.h>
>
> +bool sysctl_8021q_inskb = 1;
> +
> /* No hurry in this branch
> *
> * Exported for the bpf jit load helper.
> diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
> index d1b0804..f9a3700 100644
> --- a/net/core/sysctl_net_core.c
> +++ b/net/core/sysctl_net_core.c
> @@ -15,6 +15,7 @@
> #include <linux/init.h>
> #include <linux/slab.h>
> #include <linux/kmemleak.h>
> +#include <linux/filter.h>
>
> #include <net/ip.h>
> #include <net/sock.h>
> @@ -189,6 +190,13 @@ static struct ctl_table net_core_table[] = {
> .mode = 0644,
> .proc_handler = proc_dointvec
> },
> + {
> + .procname = "8021q_inskb",
> + .data = &sysctl_8021q_inskb,
> + .maxlen = sizeof(bool),
> + .mode = 0444,
> + .proc_handler = proc_dointvec
> + },
> { }
> };
>
^ permalink raw reply
* Re: [PATCH 00/11] Add basic VLAN support to bridges
From: Or Gerlitz @ 2012-12-12 22:54 UTC (permalink / raw)
To: Vlad Yasevich; +Cc: netdev, shemminger, davem, mst, john.r.fastabend
In-Reply-To: <1355342477-4971-1-git-send-email-vyasevic@redhat.com>
On Wed, Dec 12, 2012 at 10:01 PM, Vlad Yasevich <vyasevic@redhat.com> wrote:
> This series of patches provides an ability to add VLANs to the bridge
> ports. This is similar to what can be found in most switches.
Vlad, I wasn't sure whether these patches support both switch modes
w.r.t. vlans, namely "access" and "trunk" (or, in virtualization terms,
VST and VGT). In plain language: a mode where the entity (e.g. a VM)
using the bridge port sends and receives untagged traffic and the bridge
does the vlan tagging/marking and untagging/stripping; a mode where
packets are tagged and restricted to a set of allowed vlans; and a third,
hybrid mode where there's a default vlan used when packets arrive
untagged plus a set of allowed vlans used as a filter for tagged packets.
Also, does this patch set assume that a certain port is actually an
uplink towards the physical network/external switch?
Or.
> The bridge
> port may have any number of VLANs added to it, including vlan 0 priority-tagged
> traffic. When vlans are added to the port, only traffic tagged with a particular
> vlan will be forwarded over this port. Additionally, vlan ids are added to FDB
> entries and become part of the lookup. This way we correctly identify the FDB
> entry.
>
> A single vlan may also be designated as untagged. Any untagged traffic
> received by the port will be assigned to this vlan. Any traffic exiting
> the port with a VID matching the untagged vlan will exit untagged (the
> bridge will strip the vlan header). This is similar to "Native Vlan" support
> available in most switches.
>
> The default behavior of the bridge is unchanged if no vlans have been
> configured.
>
> Changes since rfc v2:
> - Per-port vlan bitmap is gone and is replaced with a vlan list.
> - Added bridge vlan list, which is referenced by each port. Entries in
> the bridge vlan list have a port bitmap that shows which ports are part
> of which vlan.
> - Netlink API changes.
> - Dropped sysfs support for now. If people think this is really useful,
> I can add it back.
> - Support for native/untagged vlans.
>
> Changes since rfc v1:
> - Comments addressed regarding formatting and RCU usage
> - ioctls have been removed and changed over to the netlink interface.
> - Added support of user added ndb entries.
> - changed sysfs interface to export a bitmap. Also added a write interface.
> I am not sure how much I like it, but it made my testing easier/faster. I
> might change the write interface to take text instead of binary.
>
> Vlad Yasevich (11):
> bridge: Add vlan filtering infrastructure
> bridge: Validate that vlan is permitted on ingress
> bridge: Verify that a vlan is allowed to egress on give port
> bridge: Cache vlan in the cb for faster egress lookup.
> bridge: Add vlan to unicast fdb entries
> bridge: Add vlan id to multicast groups
> bridge: Add netlink interface to configure vlans on bridge ports
> bridge: Add vlan support to static neighbors
> bridge: Add the ability to configure untagged vlans
> bridge: Implement untagged vlan handling
> bridge: Dump vlan information from a bridge port
>
> drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 5 +-
> drivers/net/macvlan.c | 2 +-
> drivers/net/vxlan.c | 3 +-
> include/linux/netdevice.h | 4 +-
> include/uapi/linux/if_bridge.h | 24 ++-
> include/uapi/linux/neighbour.h | 1 +
> include/uapi/linux/rtnetlink.h | 1 +
> net/bridge/br_device.c | 34 +++-
> net/bridge/br_fdb.c | 199 +++++++++++++---
> net/bridge/br_forward.c | 139 +++++++++++
> net/bridge/br_if.c | 312 +++++++++++++++++++++++++
> net/bridge/br_input.c | 65 +++++-
> net/bridge/br_multicast.c | 71 ++++--
> net/bridge/br_netlink.c | 154 +++++++++++--
> net/bridge/br_private.h | 66 +++++-
> net/core/rtnetlink.c | 40 +++-
> 16 files changed, 1010 insertions(+), 110 deletions(-)
>
> --
> 1.7.7.6
>
^ permalink raw reply
* Any good documentation on RTNL
From: Ben Greear @ 2012-12-12 23:15 UTC (permalink / raw)
To: netdev
I'm wondering if anyone could point me to some documentation on
the finer points of what rtnl_lock() does? I can't find anything
overly useful on Google or in the kernel docs.

For instance, can the packet rx-logic run (on other threads?) while rtnl is held?

How about tx-logic?

In particular, I'm interested to know if it is valid to have
this state:

thread 1 holds RTNL, and blocks on thread 2 due to trying to flush a work-queue.

thread 2 is processing an item on that work-queue, and the work item is sending packets
(and blocking for up to 200ms timeout trying to flush a wifi driver's queues).

Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
* Re: [PATCH 00/11] Add basic VLAN support to bridges
From: Vlad Yasevich @ 2012-12-12 23:36 UTC
To: Or Gerlitz; +Cc: netdev, shemminger, davem, mst, john.r.fastabend
In-Reply-To: <CAJZOPZL3grqtd-Tnhr-QxkdFWrhkgcp5+fOUZSaPJd3Ker_7GQ@mail.gmail.com>
On 12/12/2012 05:54 PM, Or Gerlitz wrote:
> On Wed, Dec 12, 2012 at 10:01 PM, Vlad Yasevich <vyasevic@redhat.com> wrote:
>> This series of patches provides an ability to add VLANs to the bridge
>> ports. This is similar to what can be found in most switches.
>
> Vlad, I wasn't sure if these patches support both modes of switches
> w.r.t. vlans, namely "access" and "trunk", or in virtualization terms
> VST and VGT or in natural language, both the mode where the entity
> (e.g VM) eventually using the bridge port uses untagged traffic and
> the bridge does vlan tagging/marking and vlan untagging/stripping,
> plus a mode where packets are tagged under a set of allowed vlans or a
> third hybrid mode where there's a default vlan to be used when packets
> arrive untagged and set of allowed vlans to be used as a filter for
> tagged packets.
The patches are generic enough that they can support all three. It's
just a matter of configuration.

If the entity using the switch is expecting untagged traffic for a
particular vlan, you can designate that vlan as untagged or native, and
the bridge will strip the headers. If you want more than one untagged
vlan, then you have to configure vlan interfaces under the bridge and
bridge them together.

The patch will also insert a VLAN tag on a port if that is how the port
is configured.

There are 2 things I don't do: Q-in-Q (there is nothing stopping it,
I just didn't write the code), and vlan translation (that would be a
headache). I also don't set priorities yet, but that can be added later
if it is something people want.
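
As a rough illustration of how one configuration model can cover all three
modes, the per-port decision can be modelled in userspace C. This is only a
sketch and is not code from the patches: struct port_vlans, the port_*
helpers and the VID_NONE marker are names invented here, and both
priority-tagged vlan 0 and the "no vlans configured" default are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define VLAN_N_VID 4096
/* Hypothetical marker: "frame carries no tag" / "port has no untagged vlan". */
#define VID_NONE   0xFFFF

struct port_vlans {
	uint8_t  allowed[VLAN_N_VID / 8];	/* bitmap of permitted VIDs */
	uint16_t untagged_vid;			/* VID_NONE if not configured */
};

static void port_init(struct port_vlans *p)
{
	memset(p->allowed, 0, sizeof(p->allowed));
	p->untagged_vid = VID_NONE;
}

static void port_allow(struct port_vlans *p, uint16_t vid)
{
	assert(vid < VLAN_N_VID);
	p->allowed[vid / 8] |= (uint8_t)(1u << (vid % 8));
}

static bool port_allows(const struct port_vlans *p, uint16_t vid)
{
	return vid < VLAN_N_VID && (p->allowed[vid / 8] & (1u << (vid % 8)));
}

/* Ingress: map the frame to a VID, or return false to drop it. */
static bool port_ingress(const struct port_vlans *p, uint16_t frame_vid,
			 uint16_t *vid_out)
{
	uint16_t vid = frame_vid;

	if (vid == VID_NONE) {
		/* Untagged frames are assigned to the untagged vlan, if any. */
		if (p->untagged_vid == VID_NONE)
			return false;
		vid = p->untagged_vid;
	}
	if (!port_allows(p, vid))
		return false;
	*vid_out = vid;
	return true;
}

/* Egress: return false to drop; *tagged_out tells whether the tag is kept. */
static bool port_egress(const struct port_vlans *p, uint16_t vid,
			bool *tagged_out)
{
	if (!port_allows(p, vid))
		return false;
	/* Traffic in the untagged vlan leaves with the header stripped. */
	*tagged_out = (vid != p->untagged_vid);
	return true;
}
```

In this model an access-style port allows a single vlan and marks it as
untagged, a trunk-style port allows several vlans and leaves untagged_vid at
VID_NONE, and a hybrid port does both.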
>
> Also, does this patch set assumes that a certain port is actually an
> uplink towards the physical network/external switch?
No, there is no uplink designation yet. It might be useful for some
other work I am thinking of, but it wasn't really needed here.
-vlad
>
> Or.
>
>> The bridge
>> port may have any number of VLANs added to it including vlan 0 priority tagged
>> traffic. When vlans are added to the port, only traffic tagged with a particular
>> vlan will be forwarded over this port. Additionally, vlan ids are added to FDB
>> entries and become part of the lookup. This way we correctly identify the FDB
>> entry.
>>
>> A single vlan may also be designated as untagged. Any untagged traffic
>> received by the port will be assigned to this vlan. Any traffic exiting
>> the port with a VID matching the untagged vlan will exit untagged (the
>> bridge will strip the vlan header). This is similar to "Native Vlan" support
>> available in most switches.
>>
>> The default behavior of the bridge is unchanged if no vlans have been
>> configured.
>>
>> Changes since rfc v2:
>> - Per-port vlan bitmap is gone and is replaced with a vlan list.
>> - Added bridge vlan list, which is referenced by each port. Entries in
>> the bridge vlan list have a port bitmap that shows which ports are part
>> of which vlan.
>> - Netlink API changes.
>> - Dropped sysfs support for now. If people think this is really useful,
>> I can add it back.
>> - Support for native/untagged vlans.
>>
>> Changes since rfc v1:
>> - Comments addressed regarding formatting and RCU usage
>> - ioctls have been removed and changed over to the netlink interface.
>> - Added support of user added ndb entries.
>> - changed sysfs interface to export a bitmap. Also added a write interface.
>> I am not sure how much I like it, but it made my testing easier/faster. I
>> might change the write interface to take text instead of binary.
>>
>> Vlad Yasevich (11):
>> bridge: Add vlan filtering infrastructure
>> bridge: Validate that vlan is permitted on ingress
>> bridge: Verify that a vlan is allowed to egress on give port
>> bridge: Cache vlan in the cb for faster egress lookup.
>> bridge: Add vlan to unicast fdb entries
>> bridge: Add vlan id to multicast groups
>> bridge: Add netlink interface to configure vlans on bridge ports
>> bridge: Add vlan support to static neighbors
>> bridge: Add the ability to configure untagged vlans
>> bridge: Implement untagged vlan handling
>> bridge: Dump vlan information from a bridge port
>>
>> drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 5 +-
>> drivers/net/macvlan.c | 2 +-
>> drivers/net/vxlan.c | 3 +-
>> include/linux/netdevice.h | 4 +-
>> include/uapi/linux/if_bridge.h | 24 ++-
>> include/uapi/linux/neighbour.h | 1 +
>> include/uapi/linux/rtnetlink.h | 1 +
>> net/bridge/br_device.c | 34 +++-
>> net/bridge/br_fdb.c | 199 +++++++++++++---
>> net/bridge/br_forward.c | 139 +++++++++++
>> net/bridge/br_if.c | 312 +++++++++++++++++++++++++
>> net/bridge/br_input.c | 65 +++++-
>> net/bridge/br_multicast.c | 71 ++++--
>> net/bridge/br_netlink.c | 154 +++++++++++--
>> net/bridge/br_private.h | 66 +++++-
>> net/core/rtnetlink.c | 40 +++-
>> 16 files changed, 1010 insertions(+), 110 deletions(-)
>>
>> --
>> 1.7.7.6
* Re: Any good documentation on RTNL
From: Ben Hutchings @ 2012-12-12 23:37 UTC
To: Ben Greear; +Cc: netdev
In-Reply-To: <50C9101A.6000906@candelatech.com>
On Wed, 2012-12-12 at 15:15 -0800, Ben Greear wrote:
> I'm wondering if anyone could point me to some documentation on
> the finer points of what the rtnl_lock() does? I can't find anything
> overly useful in google or the kernel docs.
>
> For instance, can the packet rx-logic run (on other threads?) while rtnl is held?
>
> How about tx-logic?
rtnl_lock() is just mutex_lock() on a particular global mutex. Since
the RX and TX paths obviously don't take such a mutex, it has no effect
on them.

All rtnetlink operations and most net device ioctls are serialised by
this mutex (it's the BKL of networking!).
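
A userspace analogy may make this concrete. The sketch below assumes POSIX
threads as a stand-in for the kernel mutex; fake_rtnl, rx_path, config_path
and probe_while_locked are made-up names, not kernel symbols. While one
thread holds the mutex, a path that never takes it runs to completion,
whereas a path that needs it finds it busy.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Userspace stand-in for the single global RTNL mutex (hypothetical name). */
static pthread_mutex_t fake_rtnl = PTHREAD_MUTEX_INITIALIZER;

struct probe_result {
	int rx_packets;		/* packets handled by the lock-free RX path */
	int config_blocked;	/* 1 if the config path found the mutex busy */
};

/* Models the RX path: it never touches fake_rtnl. */
static void *rx_path(void *arg)
{
	struct probe_result *res = arg;

	for (int i = 0; i < 100; i++)
		res->rx_packets++;
	return NULL;
}

/* Models an rtnetlink/ioctl style operation: it needs fake_rtnl. */
static void *config_path(void *arg)
{
	struct probe_result *res = arg;
	int err = pthread_mutex_trylock(&fake_rtnl);

	assert(err == 0 || err == EBUSY);
	if (err == EBUSY)
		res->config_blocked = 1;	/* a real caller would sleep here */
	else
		pthread_mutex_unlock(&fake_rtnl);
	return NULL;
}

/* Hold the mutex, then run each path to completion in another thread. */
static struct probe_result probe_while_locked(void)
{
	struct probe_result res = { 0, 0 };
	pthread_t tid;

	pthread_mutex_lock(&fake_rtnl);
	if (pthread_create(&tid, NULL, rx_path, &res) == 0)
		pthread_join(tid, NULL);
	if (pthread_create(&tid, NULL, config_path, &res) == 0)
		pthread_join(tid, NULL);
	pthread_mutex_unlock(&fake_rtnl);
	return res;
}
```

The config path uses a trylock only so that the demonstration terminates;
with a plain lock it would simply wait until the holder released the mutex.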
> In particular, I'm interested to know if it is valid to have
> this state:
>
> thread 1 holds RTNL, and blocks on thread 2 due to trying to flush a work-queue.
>
> thread 2 is processing an item on that work-queue, and the work item is sending packets
> (and blocking for up to 200ms timeout trying to flush a wifi driver's queues).
So long as the workqueue is private (if you're flushing it, I suppose it
must be) then I don't see any problem with that.
Ben.
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
* [PATCH] netfilter: nf_nat: Also handle non-ESTABLISHED routing changes in MASQUERADE
From: Andrew Collins @ 2012-12-12 23:49 UTC
To: netfilter-devel, netdev, kadlec
The MASQUERADE target now handles routing changes which affect
the output interface of a connection, but only for ESTABLISHED
connections. It is also possible for NEW connections which
already have a conntrack entry to be affected by routing changes.

This adds a check to drop entries in the NEW+conntrack state
when the oif has changed.

Signed-off-by: Andrew Collins <bsderandrew@gmail.com>
---
 net/ipv4/netfilter/iptable_nat.c | 15 ++++++++++-----
 1 files changed, 10 insertions(+), 5 deletions(-)
diff --git a/net/ipv4/netfilter/iptable_nat.c b/net/ipv4/netfilter/iptable_nat.c
index da2c8a3..eeaff7e 100644
--- a/net/ipv4/netfilter/iptable_nat.c
+++ b/net/ipv4/netfilter/iptable_nat.c
@@ -124,23 +124,28 @@ nf_nat_ipv4_fn(unsigned int hooknum,
 			ret = nf_nat_rule_find(skb, hooknum, in, out, ct);
 			if (ret != NF_ACCEPT)
 				return ret;
-		} else
+		} else {
 			pr_debug("Already setup manip %s for ct %p\n",
 				 maniptype == NF_NAT_MANIP_SRC ? "SRC" : "DST",
 				 ct);
+			if (nf_nat_oif_changed(hooknum, ctinfo, nat, out))
+				goto oif_changed;
+		}
 		break;
 
 	default:
 		/* ESTABLISHED */
 		NF_CT_ASSERT(ctinfo == IP_CT_ESTABLISHED ||
 			     ctinfo == IP_CT_ESTABLISHED_REPLY);
-		if (nf_nat_oif_changed(hooknum, ctinfo, nat, out)) {
-			nf_ct_kill_acct(ct, ctinfo, skb);
-			return NF_DROP;
-		}
+		if (nf_nat_oif_changed(hooknum, ctinfo, nat, out))
+			goto oif_changed;
 	}
 
 	return nf_nat_packet(ct, ctinfo, hooknum, skb);
+
+oif_changed:
+	nf_ct_kill_acct(ct, ctinfo, skb);
+	return NF_DROP;
 }
 
 static unsigned int
--
1.7.1
* Re: [PATCH] netfilter: nf_nat: Also handle non-ESTABLISHED routing changes in MASQUERADE
From: Andrew Collins @ 2012-12-13 0:17 UTC
To: netfilter-devel, netdev, kadlec
In-Reply-To: <1355356167-10397-1-git-send-email-bsderandrew@gmail.com>
On Wed, Dec 12, 2012 at 4:49 PM, Andrew Collins <bsderandrew@gmail.com> wrote:
> The MASQUERADE target now handles routing changes which affect
> the output interface of a connection, but only for ESTABLISHED
> connections. It is also possible for NEW connections which
> already have a conntrack entry to be affected by routing changes.
>
> This adds a check to drop entries in the NEW+conntrack state
> when the oif has changed.
>
> Signed-off-by: Andrew Collins <bsderandrew@gmail.com>
> ---
> net/ipv4/netfilter/iptable_nat.c | 15 ++++++++++-----
> 1 files changed, 10 insertions(+), 5 deletions(-)
My mistake, I forgot to include the corresponding ip6table_nat.c
change (it's identical); please ignore this for now.
* [PATCH v2] netfilter: nf_nat: Also handle non-ESTABLISHED routing changes in MASQUERADE
From: Andrew Collins @ 2012-12-13 0:23 UTC
To: netfilter-devel, netdev, kadlec
The MASQUERADE target now handles routing changes which affect
the output interface of a connection, but only for ESTABLISHED
connections. It is also possible for NEW connections which
already have a conntrack entry to be affected by routing changes.

This adds a check to drop entries in the NEW+conntrack state
when the oif has changed.

Signed-off-by: Andrew Collins <bsderandrew@gmail.com>
---
 net/ipv4/netfilter/iptable_nat.c  | 15 ++++++++++-----
 net/ipv6/netfilter/ip6table_nat.c | 15 ++++++++++-----
 2 files changed, 20 insertions(+), 10 deletions(-)
diff --git a/net/ipv4/netfilter/iptable_nat.c b/net/ipv4/netfilter/iptable_nat.c
index da2c8a3..eeaff7e 100644
--- a/net/ipv4/netfilter/iptable_nat.c
+++ b/net/ipv4/netfilter/iptable_nat.c
@@ -124,23 +124,28 @@ nf_nat_ipv4_fn(unsigned int hooknum,
 			ret = nf_nat_rule_find(skb, hooknum, in, out, ct);
 			if (ret != NF_ACCEPT)
 				return ret;
-		} else
+		} else {
 			pr_debug("Already setup manip %s for ct %p\n",
 				 maniptype == NF_NAT_MANIP_SRC ? "SRC" : "DST",
 				 ct);
+			if (nf_nat_oif_changed(hooknum, ctinfo, nat, out))
+				goto oif_changed;
+		}
 		break;
 
 	default:
 		/* ESTABLISHED */
 		NF_CT_ASSERT(ctinfo == IP_CT_ESTABLISHED ||
 			     ctinfo == IP_CT_ESTABLISHED_REPLY);
-		if (nf_nat_oif_changed(hooknum, ctinfo, nat, out)) {
-			nf_ct_kill_acct(ct, ctinfo, skb);
-			return NF_DROP;
-		}
+		if (nf_nat_oif_changed(hooknum, ctinfo, nat, out))
+			goto oif_changed;
 	}
 
 	return nf_nat_packet(ct, ctinfo, hooknum, skb);
+
+oif_changed:
+	nf_ct_kill_acct(ct, ctinfo, skb);
+	return NF_DROP;
 }
 
 static unsigned int
diff --git a/net/ipv6/netfilter/ip6table_nat.c b/net/ipv6/netfilter/ip6table_nat.c
index 6c8ae24..e0e788d 100644
--- a/net/ipv6/netfilter/ip6table_nat.c
+++ b/net/ipv6/netfilter/ip6table_nat.c
@@ -127,23 +127,28 @@ nf_nat_ipv6_fn(unsigned int hooknum,
 			ret = nf_nat_rule_find(skb, hooknum, in, out, ct);
 			if (ret != NF_ACCEPT)
 				return ret;
-		} else
+		} else {
 			pr_debug("Already setup manip %s for ct %p\n",
 				 maniptype == NF_NAT_MANIP_SRC ? "SRC" : "DST",
 				 ct);
+			if (nf_nat_oif_changed(hooknum, ctinfo, nat, out))
+				goto oif_changed;
+		}
 		break;
 
 	default:
 		/* ESTABLISHED */
 		NF_CT_ASSERT(ctinfo == IP_CT_ESTABLISHED ||
 			     ctinfo == IP_CT_ESTABLISHED_REPLY);
-		if (nf_nat_oif_changed(hooknum, ctinfo, nat, out)) {
-			nf_ct_kill_acct(ct, ctinfo, skb);
-			return NF_DROP;
-		}
+		if (nf_nat_oif_changed(hooknum, ctinfo, nat, out))
+			goto oif_changed;
 	}
 
 	return nf_nat_packet(ct, ctinfo, hooknum, skb);
+
+oif_changed:
+	nf_ct_kill_acct(ct, ctinfo, skb);
+	return NF_DROP;
 }
 
 static unsigned int
--
1.7.1
* [net-next:master 2/17] drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1932:17: sparse: incorrect type in initializer (different base types)
From: kbuild test robot @ 2012-12-13 0:30 UTC
To: Rasesh Mody; +Cc: netdev
tree: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master
head: 520dfe3a3645257bf83660f672c47f8558f3d4c4
commit: 5216562a2ccd037d0eb85a2e8bbfd6315e3f1bb5 [2/17] bna: Tx and Rx Optimizations
sparse warnings:
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:283:29: got restricted __be16 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:299:29: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:299:29: expected unsigned short [unsigned] [usertype] num_entries
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:299:29: got restricted __be16 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:315:29: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:315:29: expected unsigned short [unsigned] [usertype] num_entries
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:315:29: got restricted __be16 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:317:21: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:317:21: expected unsigned short [unsigned] [usertype] handle
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:317:21: got restricted __be16 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:330:29: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:330:29: expected unsigned short [unsigned] [usertype] num_entries
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:330:29: got restricted __be16 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:345:29: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:345:29: expected unsigned short [unsigned] [usertype] num_entries
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:345:29: got restricted __be16 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:362:29: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:362:29: expected unsigned short [unsigned] [usertype] num_entries
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:362:29: got restricted __be16 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:368:42: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:368:42: expected unsigned int [unsigned] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:368:42: got restricted __be32 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:385:29: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:385:29: expected unsigned short [unsigned] [usertype] num_entries
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:385:29: got restricted __be16 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:400:29: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:400:29: expected unsigned short [unsigned] [usertype] num_entries
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:400:29: got restricted __be16 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:402:19: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:402:19: expected unsigned short [unsigned] [usertype] size
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:402:19: got restricted __be16 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:417:29: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:417:29: expected unsigned short [unsigned] [usertype] num_entries
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:417:29: got restricted __be16 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:422:33: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:422:33: expected unsigned int [unsigned] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:422:33: got restricted __be32 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:436:29: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:436:29: expected unsigned short [unsigned] [usertype] num_entries
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:436:29: got restricted __be16 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:723:17: sparse: cast to restricted __be16
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:723:17: sparse: cast to restricted __be16
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:723:17: sparse: cast to restricted __be16
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:723:17: sparse: cast to restricted __be16
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1650:33: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1650:33: expected unsigned short [unsigned] [usertype] num_entries
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1650:33: got restricted __be16 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1664:25: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1664:25: expected unsigned short [unsigned] [usertype] pages
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1664:25: got restricted __be16 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1664:25: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1664:25: expected unsigned short [unsigned] [usertype] page_sz
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1664:25: got restricted __be16 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1666:61: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1666:61: expected unsigned short [unsigned] [usertype] rx_buffer_size
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1666:61: got restricted __be16 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1672:25: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1672:25: expected unsigned short [unsigned] [usertype] pages
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1672:25: got restricted __be16 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1672:25: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1672:25: expected unsigned short [unsigned] [usertype] page_sz
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1672:25: got restricted __be16 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1676:61: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1676:61: expected unsigned short [unsigned] [usertype] rx_buffer_size
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1676:61: got restricted __be16 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1684:17: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1684:17: expected unsigned short [unsigned] [usertype] pages
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1684:17: got restricted __be16 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1684:17: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1684:17: expected unsigned short [unsigned] [usertype] page_sz
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1684:17: got restricted __be16 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1691:54: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1691:54: expected unsigned short [unsigned] [usertype] msix_index
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1691:54: got restricted __be16 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1702:44: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1702:44: expected unsigned int [unsigned] [usertype] coalescing_timeout
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1702:44: got restricted __be32 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1704:43: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1704:43: expected unsigned int [unsigned] [usertype] inter_pkt_timeout
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1704:43: got restricted __be32 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1406:1: sparse: symbol 'bna_rx_sm_stop_wait_entry' was not declared. Should it be static?
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1459:1: sparse: symbol 'bna_rx_sm_rxf_stop_wait_entry' was not declared. Should it be static?
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1492:1: sparse: symbol 'bna_rx_sm_started_entry' was not declared. Should it be static?
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1557:1: sparse: symbol 'bna_rx_sm_cleanup_wait_entry' was not declared. Should it be static?
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1562:1: sparse: symbol 'bna_rx_sm_cleanup_wait' was not declared. Should it be static?
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1741:29: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1741:29: expected unsigned short [unsigned] [usertype] num_entries
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1741:29: got restricted __be16 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1926:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1926:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1926:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1926:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1926:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1926:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1926:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1926:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1926:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1926:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1926:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1926:9: sparse: cast to restricted __be32
+ drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1932:17: sparse: incorrect type in initializer (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1932:17: expected unsigned long long [unsigned] [usertype] tmp_addr
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1932:17: got restricted __be64 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1964:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1964:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1964:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1964:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1964:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1964:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1964:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1964:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1964:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1964:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1964:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1964:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1970:17: sparse: incorrect type in initializer (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1970:17: expected unsigned long long [unsigned] [usertype] tmp_addr
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:1970:17: got restricted __be64 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:2185:27: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:2185:27: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:2185:27: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:2185:27: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:2185:27: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:2185:27: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:2189:27: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:2189:27: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:2189:27: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:2189:27: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:2189:27: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:2189:27: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:2194:27: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:2194:27: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:2194:27: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:2194:27: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:2194:27: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:2194:27: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3168:33: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3168:33: expected unsigned short [unsigned] [usertype] num_entries
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3168:33: got restricted __be16 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3177:17: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3177:17: expected unsigned short [unsigned] [usertype] pages
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3177:17: got restricted __be16 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3177:17: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3177:17: expected unsigned short [unsigned] [usertype] page_sz
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3177:17: got restricted __be16 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3184:54: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3184:54: expected unsigned short [unsigned] [usertype] msix_index
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3184:54: got restricted __be16 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3194:44: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3194:44: expected unsigned int [unsigned] [usertype] coalescing_timeout
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3194:44: got restricted __be32 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3196:43: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3196:43: expected unsigned int [unsigned] [usertype] inter_pkt_timeout
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3196:43: got restricted __be32 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3201:33: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3201:33: expected unsigned short [unsigned] [usertype] vlan_id
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3201:33: got restricted __be16 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3217:29: sparse: incorrect type in assignment (different base types)
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3217:29: expected unsigned short [unsigned] [usertype] num_entries
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3217:29: got restricted __be16 [usertype] <noident>
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3260:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3260:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3260:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3260:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3260:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3260:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3260:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3260:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3260:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3260:9: sparse: cast to restricted __be32
drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3260:9: sparse: cast to restricted __be32
+ drivers/net/ethernet/brocade/bna/bna_tx_rx.c:3260:9: sparse: too many warnings
vim +1932 drivers/net/ethernet/brocade/bna/bna_tx_rx.c
f3bd5173 Rasesh Mody 2011-08-08 1920 rxq->qpt.page_size = page_size;
f3bd5173 Rasesh Mody 2011-08-08 1921
f3bd5173 Rasesh Mody 2011-08-08 1922 rxq->rcb->sw_qpt = (void **) swqpt_mem->kva;
5216562a Rasesh Mody 2012-12-11 1923 rxq->rcb->sw_q = page_mem->kva;
5216562a Rasesh Mody 2012-12-11 1924
5216562a Rasesh Mody 2012-12-11 1925 kva = page_mem->kva;
5216562a Rasesh Mody 2012-12-11 @1926 BNA_GET_DMA_ADDR(&page_mem->dma, dma);
f3bd5173 Rasesh Mody 2011-08-08 1927
f3bd5173 Rasesh Mody 2011-08-08 1928 for (i = 0; i < rxq->qpt.page_count; i++) {
5216562a Rasesh Mody 2012-12-11 1929 rxq->rcb->sw_qpt[i] = kva;
5216562a Rasesh Mody 2012-12-11 1930 kva += PAGE_SIZE;
5216562a Rasesh Mody 2012-12-11 1931
5216562a Rasesh Mody 2012-12-11 @1932 BNA_SET_DMA_ADDR(dma, &bna_dma);
f3bd5173 Rasesh Mody 2011-08-08 1933 ((struct bna_dma_addr *)rxq->qpt.kv_qpt_ptr)[i].lsb =
5216562a Rasesh Mody 2012-12-11 1934 bna_dma.lsb;
f3bd5173 Rasesh Mody 2011-08-08 1935 ((struct bna_dma_addr *)rxq->qpt.kv_qpt_ptr)[i].msb =
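
For readers unfamiliar with this class of warning: "expected unsigned short
... got restricted __be16" generally means that a byte-swapped value (for
example the result of htons()) is being stored in a field declared with a
plain integer type. The usual remedy is to declare the field with an
endian-annotated type. The following is a minimal userspace sketch of that
idea, not code from the bna driver; be16_t, struct fake_cfg_req and the
fake_cfg_* helpers are hypothetical names, and the typedef only documents
intent, whereas the kernel's __be16 is additionally checked by sparse.

```c
#include <arpa/inet.h>	/* htons, ntohs */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical alias marking a field that holds big-endian data. */
typedef uint16_t be16_t;

struct fake_cfg_req {
	be16_t num_entries;	/* stored in network (big-endian) byte order */
};

/* Convert from host order exactly once, at the point of storing. */
static void fake_cfg_set_entries(struct fake_cfg_req *req, uint16_t n)
{
	assert(req != NULL);
	req->num_entries = htons(n);
}

/* Convert back to host order exactly once, at the point of reading. */
static uint16_t fake_cfg_get_entries(const struct fake_cfg_req *req)
{
	assert(req != NULL);
	return ntohs(req->num_entries);
}

/* First byte of the field as it sits in memory (most significant byte). */
static uint8_t fake_cfg_first_byte(const struct fake_cfg_req *req)
{
	uint8_t bytes[2];

	memcpy(bytes, &req->num_entries, sizeof(bytes));
	return bytes[0];
}
```

Keeping the conversion inside the accessors means the rest of the code only
ever sees host-order values, which is the discipline the sparse annotations
are meant to enforce.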
---
0-DAY kernel build testing backend Open Source Technology Center
Fengguang Wu, Yuanhan Liu Intel Corporation
* [net-next:master 14/17] net/bridge/br_multicast.c:677:54: sparse: incorrect type in argument 3 (different address spaces)
From: kbuild test robot @ 2012-12-13 0:51 UTC
To: Cong Wang; +Cc: netdev
tree: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master
head: 520dfe3a3645257bf83660f672c47f8558f3d4c4
commit: cfd567543590f71ca0af397437e2554f9756d750 [14/17] bridge: add support of adding and deleting mdb entries
sparse warnings:
net/bridge/br_multicast.c:635:17: sparse: incorrect type in assignment (different address spaces)
net/bridge/br_multicast.c:635:17: expected struct net_bridge_port_group [noderef] <asn:4>*next
net/bridge/br_multicast.c:635:17: got struct net_bridge_port_group *next
+ net/bridge/br_multicast.c:677:54: sparse: incorrect type in argument 3 (different address spaces)
net/bridge/br_multicast.c:677:54: expected struct net_bridge_port_group *next
net/bridge/br_multicast.c:677:54: got struct net_bridge_port_group [noderef] <asn:4>*<noident>
net/bridge/br_multicast.c:1175:48: sparse: restricted __be16 degrades to integer
net/bridge/br_multicast.c:1175:48: sparse: restricted __be16 degrades to integer
net/bridge/br_multicast.c:1175:48: sparse: restricted __be16 degrades to integer
net/bridge/br_multicast.c:1175:48: sparse: restricted __be16 degrades to integer
net/bridge/br_multicast.c:1175:48: sparse: restricted __be16 degrades to integer
net/bridge/br_multicast.c:1175:48: sparse: restricted __be16 degrades to integer
vim +677 net/bridge/br_multicast.c
cfd56754 Cong Wang 2012-12-11 629 p = kzalloc(sizeof(*p), GFP_ATOMIC);
cfd56754 Cong Wang 2012-12-11 630 if (unlikely(!p))
cfd56754 Cong Wang 2012-12-11 631 return NULL;
cfd56754 Cong Wang 2012-12-11 632
cfd56754 Cong Wang 2012-12-11 633 p->addr = *group;
cfd56754 Cong Wang 2012-12-11 634 p->port = port;
cfd56754 Cong Wang 2012-12-11 @635 p->next = next;
cfd56754 Cong Wang 2012-12-11 636 hlist_add_head(&p->mglist, &port->mglist);
cfd56754 Cong Wang 2012-12-11 637 setup_timer(&p->timer, br_multicast_port_group_expired,
cfd56754 Cong Wang 2012-12-11 638 (unsigned long)p);
cfd56754 Cong Wang 2012-12-11 639 return p;
cfd56754 Cong Wang 2012-12-11 640 }
cfd56754 Cong Wang 2012-12-11 641
eb1d1641 Herbert Xu 2010-02-27 642 static int br_multicast_add_group(struct net_bridge *br,
8ef2a9a5 YOSHIFUJI Hideaki 2010-04-18 643 struct net_bridge_port *port,
8ef2a9a5 YOSHIFUJI Hideaki 2010-04-18 644 struct br_ip *group)
eb1d1641 Herbert Xu 2010-02-27 645 {
eb1d1641 Herbert Xu 2010-02-27 646 struct net_bridge_mdb_entry *mp;
eb1d1641 Herbert Xu 2010-02-27 647 struct net_bridge_port_group *p;
e8051688 Eric Dumazet 2010-11-15 648 struct net_bridge_port_group __rcu **pp;
eb1d1641 Herbert Xu 2010-02-27 649 unsigned long now = jiffies;
eb1d1641 Herbert Xu 2010-02-27 650 int err;
eb1d1641 Herbert Xu 2010-02-27 651
eb1d1641 Herbert Xu 2010-02-27 652 spin_lock(&br->multicast_lock);
eb1d1641 Herbert Xu 2010-02-27 653 if (!netif_running(br->dev) ||
eb1d1641 Herbert Xu 2010-02-27 654 (port && port->state == BR_STATE_DISABLED))
eb1d1641 Herbert Xu 2010-02-27 655 goto out;
eb1d1641 Herbert Xu 2010-02-27 656
eb1d1641 Herbert Xu 2010-02-27 657 mp = br_multicast_new_group(br, port, group);
eb1d1641 Herbert Xu 2010-02-27 658 err = PTR_ERR(mp);
4c0833bc Tobias Klauser 2010-12-10 659 if (IS_ERR(mp))
eb1d1641 Herbert Xu 2010-02-27 660 goto err;
eb1d1641 Herbert Xu 2010-02-27 661
eb1d1641 Herbert Xu 2010-02-27 662 if (!port) {
8a870178 Herbert Xu 2011-02-12 663 mp->mglist = true;
eb1d1641 Herbert Xu 2010-02-27 664 mod_timer(&mp->timer, now + br->multicast_membership_interval);
eb1d1641 Herbert Xu 2010-02-27 665 goto out;
eb1d1641 Herbert Xu 2010-02-27 666 }
eb1d1641 Herbert Xu 2010-02-27 667
e8051688 Eric Dumazet 2010-11-15 668 for (pp = &mp->ports;
e8051688 Eric Dumazet 2010-11-15 669 (p = mlock_dereference(*pp, br)) != NULL;
e8051688 Eric Dumazet 2010-11-15 670 pp = &p->next) {
eb1d1641 Herbert Xu 2010-02-27 671 if (p->port == port)
eb1d1641 Herbert Xu 2010-02-27 672 goto found;
eb1d1641 Herbert Xu 2010-02-27 673 if ((unsigned long)p->port < (unsigned long)port)
eb1d1641 Herbert Xu 2010-02-27 674 break;
eb1d1641 Herbert Xu 2010-02-27 675 }
eb1d1641 Herbert Xu 2010-02-27 676
cfd56754 Cong Wang 2012-12-11 @677 p = br_multicast_new_port_group(port, group, *pp);
eb1d1641 Herbert Xu 2010-02-27 678 if (unlikely(!p))
eb1d1641 Herbert Xu 2010-02-27 679 goto err;
eb1d1641 Herbert Xu 2010-02-27 680 rcu_assign_pointer(*pp, p);
---
0-DAY kernel build testing backend Open Source Technology Center
Fengguang Wu, Yuanhan Liu Intel Corporation
* RFC: Launch Time Support
From: Ulf Samuelsson @ 2012-12-13 1:04 UTC (permalink / raw)
To: netdev
Hi, I am looking for some feedback on how to implement launchtime
in the kernel.
I.e., you define WHEN you want to send a packet,
and the driver stores the packet in a buffer and sends it out
on the net when the internal timestamp counter in the network controller
reaches the specified "launch time".
Some Ethernet controllers, like the new Intel i210, support "launch time".
Support for launch time is desirable for any isochronous connection,
but I am currently interested in the NTP protocol to improve the timing.
Proposed Changes to the Kernel
===========================================================
The launchtime support will be dependent on CONFIG_NET_LAUNCHTIME
If this is not set, then the kernel functionality is not changed.
My current idea is to add a new bit to the "flags" field of
"socket.c:sendto"
#define MSG_LAUNCHTIME 0x?????
struct msghdr gets an additional launchtime field.
sendto will check if the flags parameter contains MSG_LAUNCHTIME.
If it does, then the first 64-bit word of the packet (buff) contains
the launchtime.
The launchtime from the buffer is copied to the msghdr.launchtime field,
and the first 64 bits of the packet are then shaved off before the address
is written to the msghdr.
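As a rough userspace illustration of the buffer layout described above (this is not an existing kernel API): the launch time occupies the first 64 bits of the buffer handed to sendto(), followed by the real payload. The helper names launchtime_pack() and launchtime_unpack() and the use of host byte order are assumptions made only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: prepend a 64-bit launch time to a payload.
 * Returns the total number of bytes written, or 0 if dst is too small. */
size_t launchtime_pack(uint8_t *dst, size_t dst_len, uint64_t launchtime,
                       const void *payload, size_t payload_len)
{
    assert(dst != NULL && payload != NULL);
    if (dst_len < sizeof(launchtime) ||
        dst_len - sizeof(launchtime) < payload_len)
        return 0;
    memcpy(dst, &launchtime, sizeof(launchtime));
    memcpy(dst + sizeof(launchtime), payload, payload_len);
    return sizeof(launchtime) + payload_len;
}

/* Hypothetical counterpart of what the proposal has sendto() do:
 * copy out the launch time and "shave off" the first 64 bits by
 * returning a pointer to the remaining payload. */
const uint8_t *launchtime_unpack(const uint8_t *buf, size_t len,
                                 uint64_t *launchtime, size_t *payload_len)
{
    assert(buf != NULL && launchtime != NULL && payload_len != NULL);
    if (len < sizeof(*launchtime))
        return NULL;
    memcpy(launchtime, buf, sizeof(*launchtime));
    *payload_len = len - sizeof(*launchtime);
    return buf + sizeof(*launchtime);
}
```

An application would call launchtime_pack() before sendto() with the proposed MSG_LAUNCHTIME flag; the unpack side only mirrors the copy-and-strip step described in the text.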
Each network controller supporting launchtime needs to have an alternative
call, "send packet with launchtime". This call adds the launchtime
parameter.
If launchtime is supported, the exported "ops" includes the new call.
The UDP/IP packet send will check the MSG_LAUNCHTIME and
if set, it will check if the "send packet with launchtime" call
is available for the driver and if so call it, otherwise it will call
the normal send packet and thus ignore the launchtime.
Before launchtime is used, the application should send an ioctl
to the driver, making sure that launchtime is configured,
and only if the driver ACKs will the application use launchtime.
(Possibly the "ops" field for "send packet with launchtime" should be
NULL until that ioctl is complete. Comments?)
To me, this seems to be transparent for all other network stacks
so protocols and drivers not supporting launchtime should still work.
As far as I know, drivers do not support launch time today.
The Intel igb driver does not, in the latest version on the Intel web site.
There are some defines in the headers of the latest version describing the
registers, but so far the code is not using them.
There is the linux_igb_avb project on sourceforge which allows use of
launch time for user space applications, but not as part of the kernel.
Maybe there is more work done somewhere else, but I am not aware
of it, so any links to such work are appreciated.
There are some FPGA-based PCIe boards that support launchtime (Endace DAG)
using proprietary APIs.
I talked to some vendors providing TCP/IP offload engines for FPGA,
and they do not support launchtime and, like Endace, use proprietary APIs,
so they are only usable by custom programs. Normal networking interfaces
are not supported.
Comments on the above are appreciated.
BACKGROUND
For those that do not know how the NTP protocol works:
===================================================
The client sends a UDP packet to the NTP server using port 123.
The NTP client reads the current systime and puts that in the outgoing
packet.
There is a delay between the time the systime is read, and the time
the packet actually leaves the Ethernet controller adding jitter to the
NTP algorithm.
When the server receives the packet, it can be timestamped in H/W
and a CMSG is then created by the network stack containing that
timestamp for use by the server NTP daemon.
The server generates a reply, which needs to include the client
transmit time, the server's receive time, and the server's transmit time.
Again, the transmit time needs to be written into the NTP packet,
and then the packet needs to be processed through the network stack before
it leaves the Ethernet controller, causing more jitter.
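To make the effect of a badly placed transmit timestamp concrete, here is a small sketch of the standard NTP offset and delay calculation from the four timestamps of one exchange. The fourth timestamp (the client's receive time, t4) is not mentioned above but is part of the standard calculation; the struct and function names are made up for this example, and times are plain doubles in seconds to keep it short.

```c
#include <assert.h>
#include <stddef.h>

/* One NTP exchange; names are illustrative only.
 * t1: client transmit time (client clock)
 * t2: server receive time  (server clock)
 * t3: server transmit time (server clock)
 * t4: client receive time  (client clock) */
struct ntp_sample {
    double t1, t2, t3, t4;
};

/* Estimated offset of the server clock relative to the client clock. */
double ntp_offset(const struct ntp_sample *s)
{
    assert(s != NULL);
    return ((s->t2 - s->t1) + (s->t3 - s->t4)) / 2.0;
}

/* Round-trip network delay, excluding the server's processing time. */
double ntp_delay(const struct ntp_sample *s)
{
    assert(s != NULL);
    return (s->t4 - s->t1) - (s->t3 - s->t2);
}
```

With this formula, if t1 is written into the packet some time e before the packet really leaves the controller, the computed offset is wrong by e/2. That error varies from packet to packet, which is the jitter described above.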
If launch time is supported, then the client NTP daemon would simply
read the systime and add a constant delay to create the transmit timestamp.
The delay needs to be sufficiently large to ensure that all processing
is done.
The server will do something similar, adding a constant to the server
receive timestamp to create the server transmit timestamp.
If both the client and the server use H/W timestamping and launch time,
then the jitter is ideally reduced to zero.
TRANSMIT TIMESTAMPING
========================
Support for TX timestamps in H/W is not really useful, since you need to
provide the TX timestamp in the packet you measure on, so when you know the
timestamp it is too late. Server-to-server NTP connections support sending
that timestamp in a new packet, but there is no such support in
client-server communication.
The i210 supports putting the timestamp inside the packet as it leaves the
Ethernet controller, but that means that you screw up the UDP checksum, so
the packet will be rejected by the receiving NTP daemon.
In addition, the i210 timestamp measures seconds and nanoseconds,
which is incompatible with the NTP timestamp, which uses seconds
and a 32-bit fraction of a second, so that does not work either.
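To illustrate the format mismatch: a nanosecond count runs from 0 to 999999999, while the NTP fraction counts units of 2^-32 seconds. A software conversion between the two could look like the sketch below (function names are invented for the example). It only shows the difference between the two representations; it does not help when the hardware writes its own format directly into the packet.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Convert nanoseconds (0..999999999) to an NTP 32-bit fraction,
 * i.e. units of 2^-32 seconds, rounding down. */
uint32_t nsec_to_ntp_frac(uint32_t nsec)
{
    assert(nsec < NSEC_PER_SEC);
    /* nsec < 2^30, so the shifted value fits in 64 bits. */
    return (uint32_t)(((uint64_t)nsec << 32) / NSEC_PER_SEC);
}

/* Convert an NTP 32-bit fraction back to nanoseconds, rounding down. */
uint32_t ntp_frac_to_nsec(uint32_t frac)
{
    return (uint32_t)(((uint64_t)frac * NSEC_PER_SEC) >> 32);
}
```

Half a second, for instance, is 500000000 ns on one side and 0x80000000 on the other.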
Best Regards
Ulf Samuelsson
eMagii.
* Re: [GIT] Networking
From: Linus Torvalds @ 2012-12-13 2:15 UTC (permalink / raw)
To: David Miller
Cc: Andrew Morton, Network Development, Linux Kernel Mailing List
In-Reply-To: <20121212.151116.143443755590581447.davem@davemloft.net>
On Wed, Dec 12, 2012 at 12:11 PM, David Miller <davem@davemloft.net> wrote:
>
> There is one merge conflict to resolve in net/sched/cls_cgroup.c,
> one commit changes the name of some members to "css_*" (this came
> from Tejun's tree) and another commit adds an "attach" method.
There's more than that. The ARM board mess is apparently now affecting
the networking merges too.
I fixed it up. Hopefully correctly.
Also, why does the new SHA1 hmac cookie support default to 'y'?
Linus
* Re: [GIT] Networking
From: David Miller @ 2012-12-13 2:27 UTC (permalink / raw)
To: torvalds; +Cc: akpm, netdev, linux-kernel
In-Reply-To: <CA+55aFwzUgxQAze=mYbEx8b61V542tzm06Df=mR1BtYVbJy0mg@mail.gmail.com>
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Wed, 12 Dec 2012 18:15:04 -0800
> On Wed, Dec 12, 2012 at 12:11 PM, David Miller <davem@davemloft.net> wrote:
>>
>> There is one merge conflict to resolve in net/sched/cls_cgroup.c,
>> one commit changes the name of some members to "css_*" (this came
>> from Tejun's tree) and another commit adds an "attach" method.
>
> There's more than that. The ARM board mess is apparently now affecting
> the networking merges too.
>
> I fixed it up. Hopefully correctly.
>
> Also, why does the new SHA1 hmac cookie support default to 'y'?
There are two SCTP HMAC cookie algorithms, MD5 and SHA1.
What used to happen is that you had to choose one at build
time, and then you were stuck with that decision and it was
all that you could use.
Now, it's selectable at run time.
If there's anything you find particularly anti-social about
this, I'm sure we can adjust it.
* Re: [GIT] Networking
From: Linus Torvalds @ 2012-12-13 2:37 UTC (permalink / raw)
To: David Miller
Cc: Andrew Morton, Network Development, Linux Kernel Mailing List
In-Reply-To: <20121212.212734.917363230032045212.davem@davemloft.net>
On Wed, Dec 12, 2012 at 6:27 PM, David Miller <davem@davemloft.net> wrote:
>
> There are two SCTP HMAC cookie algorithms, MD5 and SHA1.
>
> What used to happen is that you had to choose one at build
> time, and then you were stuck with that decision and it was
> all that you could use.
>
> Now, it's selectable at run time.
>
> If there's anything you find particularly anti-social about
> this, I'm sure we can adjust it.
So I'd suggest doing the same thing that the new thermal throttling
Kconfig does: start off by asking for the default algorithm, then ask
about the others.
The "choice" part selects the one that is default (so it never gets
asked about and is obviously compiled in), and the rest default to no
like we should.
See drivers/thermal/Kconfig for an example of this. I think we do it
in other places too, but that one happens to be new so I picked it as
an example.
The rule should be that we *never* default anything to 'yes', unless
it's old functionality that we always compiled in before too, and now
it got made conditional. So if you see a "default y" on new options,
you should basically consider it broken.
We're already bloating too much, we should not encourage people to
make things more bloated than necessary.
Btw, that Kconfig option has basically no useful help text either.
What's the point of repeating the question as a "help" message?
If people can't explain why anybody should enable it, it sure as hell
shouldn't default to 'y'. Maybe it shouldn't exist at all?
Linus
* Re: [GIT] Networking
From: David Miller @ 2012-12-13 3:22 UTC (permalink / raw)
To: torvalds; +Cc: akpm, netdev, linux-kernel, nhorman, vyasevich
In-Reply-To: <CA+55aFxvHrNYB_J851XTkZ4EiwZ68Fb64DEU1JJmxPV-zB+9Vw@mail.gmail.com>
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Wed, 12 Dec 2012 18:37:08 -0800
> On Wed, Dec 12, 2012 at 6:27 PM, David Miller <davem@davemloft.net> wrote:
>>
>> There are two SCTP HMAC cookie algorithms, MD5 and SHA1.
>>
>> What used to happen is that you had to choose one at build
>> time, and then you were stuck with that decision and it was
>> all that you could use.
>>
>> Now, it's selectable at run time.
>>
>> If there's anything you find particularly anti-social about
>> this, I'm sure we can adjust it.
>
> So I'd suggest doing the same thing that the new thermal throttling
> Kconfig does: start off by asking for the default algorithm, then ask
> about the others.
>
> The "choice" part selects the one that is default (so it never gets
> asked about and is obviously compiled in), and the rest default to no
> like we should.
>
> See drivers/thermal/Kconfig for an example of this. I think we do it
> in other places too, but that one happens to be new so I picked it as
> an example.
>
> The rule should be that we *never* default anything to 'yes', unless
> it's old functionality that we always compiled in before too, and now
> it got made conditional. So if you see a "default y" on new options,
> you should basically consider it broken.
>
> We're already bloating too much, we should not encourage people to
> make things more bloated than necessary.
>
> Btw, that Kconfig option has basically no useful help text either.
> What's the point of repeating the question as a "help" message?
>
> If people can't explain why anybody should enable it, it sure as hell
> shouldn't default to 'y'. Maybe it shouldn't exist at all?
Neil and Vlad, please take care of this.
Thanks.
* Re: [Query] TCP TFO Query
From: Yuchung Cheng @ 2012-12-13 3:49 UTC (permalink / raw)
To: Ketan Kulkarni; +Cc: netdev
In-Reply-To: <CAD6NSj4dMG3OC0mb4Qiq2eXTNwFBonkcnw=gRF5YAef-5yjeVQ@mail.gmail.com>
On Wed, Dec 12, 2012 at 10:17 AM, Ketan Kulkarni <ketkulka@gmail.com> wrote:
> Thanks Yuchung for your reply.
>
> My only concern is -If syn+data is sent by client and syn-ack only acks the
> ISN, then isnt this a sufficient indication that server now is not
> supporting the TFO? So for further connections to this server, instead of
> sending syn+data, only ask for cookie. (fall back to the state where it was
> all started) (Note that this condition is different from syn+data is dropped
> in the nw.)
>
> I agree with you in saying it doesn't lead to any performance penalty,
> however sending syn+data to a server seems a little odd when we know we have
> sufficient information to believe that it may not be accepted at first,
> retransmitted later. And otherwise we also have a way to fall back and
> re-attempt the TFO.
Your proposal sounds reasonable. We can change that. In addition,
maybe we can change the server to send a SYN-ACK acking only the ISN, with a
cookie option, if the server prefers the client to still do SYN-data-cookie
next time for some reason. I will try to prepare an RFC patch soon.
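The client-side decision being discussed can be sketched roughly as follows. This is only an illustration of the proposal, not the kernel's TFO code; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* What the client should do on its NEXT connection to this server. */
enum tfo_next {
    TFO_SEND_SYN_DATA,   /* keep the cached cookie, send SYN+data again */
    TFO_REQUEST_COOKIE   /* drop the cached cookie, only ask for a new one */
};

/* What the client observed in the SYN-ACK after sending SYN+cookie+data. */
struct synack_info {
    bool data_acked;  /* SYN-ACK acknowledged the data carried in the SYN */
    bool has_cookie;  /* SYN-ACK carried a TFO cookie option */
};

enum tfo_next tfo_next_action(const struct synack_info *sa)
{
    assert(sa != NULL);
    if (sa->data_acked)
        return TFO_SEND_SYN_DATA;
    /* Only the ISN was acked. A cookie option in the SYN-ACK is the
     * server's hint that the client should keep using TFO. */
    if (sa->has_cookie)
        return TFO_SEND_SYN_DATA;
    /* No hint: assume the server stopped supporting TFO. */
    return TFO_REQUEST_COOKIE;
}
```

A restarted server without TFO support would send no cookie option, so the client would fall back to requesting a cookie, which covers the case Ketan describes.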
>
> Thoughts?
>
> Thanks,
> Ketan
>
> On Dec 12, 2012 3:34 AM, "Yuchung Cheng" <ycheng@google.com> wrote:
>>
>> Hi Ketan,
>>
>> On Tue, Dec 11, 2012 at 9:29 AM, Ketan Kulkarni <ketkulka@gmail.com>
>> wrote:
>> > Hi,
>> > I am testing tcp tfo behavior with httping client and polipo server on
>> > 3.7rc-8
>> >
>> > One observation from my TFO testing -If for a connection server sends
>> > a cookie to client, client always does TFO for subsequent connections.
>> > This is ok.
>> >
>> > If for some reason, server stops supporting TFO (either because server
>> > got restarted without TFO support (in my case) or because path changed
>> > and the nw node is dropping packet with unknown syn option or
>> > stripping the option), client does not clear up its cookie cache. It
>> > always sends data in syn and server never acks the syn-data and client
>> > retransmits.
>> >
>> > As per kernel code -if syn-data is not acked it is retransmitted
>> > immediately - with the assumption first syn was dropped (but the
>> > assumption server stopped supporting TFO might not have been
>> > considered)
>> >
>> > Will it be better to flush the cookie for this server and re-attempt
>> > the cookie "negotiation" on subsequent connection than to retransmit
>> > the data every time?
>> >
>> > Your thoughts?
>>
>> In our initial design the client actually removes the cookie of the
>> particular server
>> (!= flush the entire cache though). Later on we changed to the current
>> behavior because
>> it does not have a performance penalty. It falls back to regular
>> handshake:
>>
>> SYN/cookie/data -> SYN-ACK acking ISN -> ACK(data).
>>
>> It may happen frequently when a large server farms are upgrading to
>> support TFO.
>>
>> However there are always more options:
>> 1) Server can selectively instrument to delete old cookies by sending a
>> SYN-ACK
>> acking initial sequence with a null TFO option (== caching a null
>> cookie ==
>> removing the older one).
> In the case I mentioned, this might not help because server got restarted
> with TFO disabled so having this option can help cases when server
> understands/supports tfo and know when to delete the client side cookie. Or
> may be I am missing something!!!
>
>> 2) another client-side flag in sysctl_tcp_fastopen to remove cookie if
>> SYN-ACK
>> only acks the syn sequence.
> My view is to prefer keeping knobs as minimum as possible as otherwise imo
> we might put extra efforts on the user to know and understand why and what
> this flag is when he is simply interested in TFO.
>
>> 3) combination of 1 and 2.
>>
>> More ideas are welcome :)
>>
>> NOTE: I've checked in a patch so that syn-data not acked is not treated as
>> a
>> network-drop.
>> http://patchwork.ozlabs.org/patch/171978/
>>
>> Yuchung
>>
>> >
>> > Thanks,
>> > Ketan
>> > --
>> > To unsubscribe from this list: send the line "unsubscribe netdev" in
>> > the body of a message to majordomo@vger.kernel.org
>> > More majordomo info at http://vger.kernel.org/majordomo-info.html
* GPF in skb_flow_dissect
From: Dave Jones @ 2012-12-13 4:16 UTC (permalink / raw)
To: netdev
Since today's net merge, I see this when I start openvpn..
general protection fault: 0000 [#1] PREEMPT SMP
Modules linked in: ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack nf_conntrack ip6table_filter ip6_tables xfs iTCO_wdt iTCO_vendor_support snd_emu10k1 snd_util_mem snd_ac97_codec coretemp ac97_bus microcode snd_hwdep snd_seq pcspkr snd_pcm snd_page_alloc snd_timer lpc_ich i2c_i801 snd_rawmidi mfd_core snd_seq_device snd e1000e soundcore emu10k1_gp gameport i82975x_edac edac_core vhost_net tun macvtap macvlan kvm_intel kvm binfmt_misc nfsd auth_rpcgss nfs_acl lockd sunrpc btrfs libcrc32c zlib_deflate firewire_ohci sata_sil firewire_core crc_itu_t radeon i2c_algo_bit drm_kms_helper ttm drm i2c_core floppy
CPU 0
Pid: 1381, comm: openvpn Not tainted 3.7.0+ #14 /D975XBX
RIP: 0010:[<ffffffff815b54a4>] [<ffffffff815b54a4>] skb_flow_dissect+0x314/0x3e0
RSP: 0018:ffff88007d0d9c48 EFLAGS: 00010206
RAX: 000000000000055d RBX: 6b6b6b6b6b6b6b4b RCX: 1471030a0180040a
RDX: 0000000000000005 RSI: 00000000ffffffe0 RDI: ffff8800ba83fa80
RBP: ffff88007d0d9cb8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000101 R12: ffff8800ba83fa80
R13: 0000000000000008 R14: ffff88007d0d9cc8 R15: ffff8800ba83fa80
FS: 00007f6637104800(0000) GS:ffff8800bf600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f563f5b01c4 CR3: 000000007d140000 CR4: 00000000000007f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process openvpn (pid: 1381, threadinfo ffff88007d0d8000, task ffff8800a540cd60)
Stack:
ffff8800ba83fa80 0000000000000296 0000000000000000 0000000000000000
ffff88007d0d9cc8 ffffffff815bcff4 ffff88007d0d9ce8 ffffffff815b1831
ffff88007d0d9ca8 00000000703f6364 ffff8800ba83fa80 0000000000000000
Call Trace:
[<ffffffff815bcff4>] ? netif_rx+0x114/0x4c0
[<ffffffff815b1831>] ? skb_copy_datagram_from_iovec+0x61/0x290
[<ffffffff815b672a>] __skb_get_rxhash+0x1a/0xd0
[<ffffffffa03b9538>] tun_get_user+0x418/0x810 [tun]
[<ffffffff8135f468>] ? delay_tsc+0x98/0xf0
[<ffffffff8109605c>] ? __rcu_read_unlock+0x5c/0xa0
[<ffffffffa03b9a41>] tun_chr_aio_write+0x81/0xb0 [tun]
[<ffffffff81145011>] ? __buffer_unlock_commit+0x41/0x50
[<ffffffff811db917>] do_sync_write+0xa7/0xe0
[<ffffffff811dc01f>] vfs_write+0xaf/0x190
[<ffffffff811dc375>] sys_write+0x55/0xa0
[<ffffffff81705540>] tracesys+0xdd/0xe2
Code: 41 8b 44 24 68 41 2b 44 24 6c 01 de 29 f0 83 f8 03 0f 8e a0 00 00 00 48 63 de 49 03 9c 24 e0 00 00 00 48 85 db 0f 84 72 fe ff ff <8b> 03 41 89 46 08 b8 01 00 00 00 e9 43 fd ff ff 0f 1f 40 00 48
RIP [<ffffffff815b54a4>] skb_flow_dissect+0x314/0x3e0
RSP <ffff88007d0d9c48>
---[ end trace 6d42c834c72c002e ]---
Faulting instruction is
0: 8b 03 mov (%rbx),%eax
rbx is slab poison (0x6b6b6b6b6b6b6b6b - 0x20) so this looks like a use-after-free here...
flow->ports = *ports;
314: 8b 03 mov (%rbx),%eax
316: 41 89 46 08 mov %eax,0x8(%r14)
in the inlined skb_header_pointer in skb_flow_dissect
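For readers who want to check the poison arithmetic: freed slab memory is filled with the byte 0x6b (POISON_FREE in include/linux/poison.h), so a pointer loaded from a freed object reads as 0x6b6b6b6b6b6b6b6b, and RBX in the oops is that value plus an offset. A small userspace sketch, with helper names invented for the example:

```c
#include <assert.h>
#include <stdint.h>

#define POISON_FREE_BYTE 0x6bULL  /* same value as the kernel's POISON_FREE */

/* A 64-bit word with every byte set to the poison value. */
uint64_t poison_word(void)
{
    return POISON_FREE_BYTE * 0x0101010101010101ULL;
}

/* Signed distance of a register value from the poison word. */
int64_t poison_delta(uint64_t value)
{
    return (int64_t)(value - poison_word());
}

/* Check against the RBX value from the oops above. */
void check_oops_rbx(void)
{
    assert(poison_delta(0x6b6b6b6b6b6b6b4bULL) == -0x20);
}
```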
Dave
* remove noisy message from llcp_sock_sendmsg
From: Dave Jones @ 2012-12-13 4:11 UTC (permalink / raw)
To: netdev
This is easily triggerable when fuzz-testing as an unprivileged user.
We could rate-limit it, but given we don't print similar messages
for other protocols, I just removed it.
Signed-off-by: Dave Jones <davej@redhat.com>
diff --git a/net/nfc/llcp/sock.c b/net/nfc/llcp/sock.c
index 0fa1e92..fea22eb 100644
--- a/net/nfc/llcp/sock.c
+++ b/net/nfc/llcp/sock.c
@@ -614,10 +614,6 @@ static int llcp_sock_sendmsg(struct kiocb *iocb, struct socket *sock,
if (msg->msg_namelen < sizeof(*addr)) {
release_sock(sk);
-
- pr_err("Invalid socket address length %d\n",
- msg->msg_namelen);
-
return -EINVAL;
}
* Re: GPF in skb_flow_dissect
From: Eric Dumazet @ 2012-12-13 5:22 UTC (permalink / raw)
To: Dave Jones, Jason Wang, David Miller; +Cc: netdev
In-Reply-To: <20121213041644.GB1611@redhat.com>
From: Eric Dumazet <edumazet@google.com>
On Wed, 2012-12-12 at 23:16 -0500, Dave Jones wrote:
> Since todays net merge, I see this when I start openvpn..
>
> general protection fault: 0000 [#1] PREEMPT SMP
> Modules linked in: ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack nf_conntrack ip6table_filter ip6_tables xfs iTCO_wdt iTCO_vendor_support snd_emu10k1 snd_util_mem snd_ac97_codec coretemp ac97_bus microcode snd_hwdep snd_seq pcspkr snd_pcm snd_page_alloc snd_timer lpc_ich i2c_i801 snd_rawmidi mfd_core snd_seq_device snd e1000e soundcore emu10k1_gp gameport i82975x_edac edac_core vhost_net tun macvtap macvlan kvm_intel kvm binfmt_misc nfsd auth_rpcgss nfs_acl lockd sunrpc btrfs libcrc32c zlib_deflate firewire_ohci sata_sil firewire_core crc_itu_t radeon i2c_algo_bit drm_kms_helper ttm drm i2c_core floppy
> CPU 0
> Pid: 1381, comm: openvpn Not tainted 3.7.0+ #14 /D975XBX
> RIP: 0010:[<ffffffff815b54a4>] [<ffffffff815b54a4>] skb_flow_dissect+0x314/0x3e0
> RSP: 0018:ffff88007d0d9c48 EFLAGS: 00010206
> RAX: 000000000000055d RBX: 6b6b6b6b6b6b6b4b RCX: 1471030a0180040a
> RDX: 0000000000000005 RSI: 00000000ffffffe0 RDI: ffff8800ba83fa80
> RBP: ffff88007d0d9cb8 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000101 R12: ffff8800ba83fa80
> R13: 0000000000000008 R14: ffff88007d0d9cc8 R15: ffff8800ba83fa80
> FS: 00007f6637104800(0000) GS:ffff8800bf600000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f563f5b01c4 CR3: 000000007d140000 CR4: 00000000000007f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process openvpn (pid: 1381, threadinfo ffff88007d0d8000, task ffff8800a540cd60)
> Stack:
> ffff8800ba83fa80 0000000000000296 0000000000000000 0000000000000000
> ffff88007d0d9cc8 ffffffff815bcff4 ffff88007d0d9ce8 ffffffff815b1831
> ffff88007d0d9ca8 00000000703f6364 ffff8800ba83fa80 0000000000000000
> Call Trace:
> [<ffffffff815bcff4>] ? netif_rx+0x114/0x4c0
> [<ffffffff815b1831>] ? skb_copy_datagram_from_iovec+0x61/0x290
> [<ffffffff815b672a>] __skb_get_rxhash+0x1a/0xd0
> [<ffffffffa03b9538>] tun_get_user+0x418/0x810 [tun]
> [<ffffffff8135f468>] ? delay_tsc+0x98/0xf0
> [<ffffffff8109605c>] ? __rcu_read_unlock+0x5c/0xa0
> [<ffffffffa03b9a41>] tun_chr_aio_write+0x81/0xb0 [tun]
> [<ffffffff81145011>] ? __buffer_unlock_commit+0x41/0x50
> [<ffffffff811db917>] do_sync_write+0xa7/0xe0
> [<ffffffff811dc01f>] vfs_write+0xaf/0x190
> [<ffffffff811dc375>] sys_write+0x55/0xa0
> [<ffffffff81705540>] tracesys+0xdd/0xe2
> Code: 41 8b 44 24 68 41 2b 44 24 6c 01 de 29 f0 83 f8 03 0f 8e a0 00 00 00 48 63 de 49 03 9c 24 e0 00 00 00 48 85 db 0f 84 72 fe ff ff <8b> 03 41 89 46 08 b8 01 00 00 00 e9 43 fd ff ff 0f 1f 40 00 48
> RIP [<ffffffff815b54a4>] skb_flow_dissect+0x314/0x3e0
> RSP <ffff88007d0d9c48>
> ---[ end trace 6d42c834c72c002e ]---
>
>
> Faulting instruction is
>
> 0: 8b 03 mov (%rbx),%eax
>
> rbx is slab poison (-20) so this looks like a use-after-free here...
>
> flow->ports = *ports;
> 314: 8b 03 mov (%rbx),%eax
> 316: 41 89 46 08 mov %eax,0x8(%r14)
>
> in the inlined skb_header_pointer in skb_flow_dissect
>
> Dave
>
Yes, commit 7694a3acc55a7 added this bug
It's illegal to use skb after the call to netif_rx_ni(skb);
I would try following patch.
Thanks !
[PATCH] tuntap: dont use skb after netif_rx_ni(skb)
commit 96442e4242 (tuntap: choose the txq based on rxq) added
a use after free.
Cache rxhash in a temp variable before calling netif_rx_ni()
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jason Wang <jasowang@redhat.com>
---
drivers/net/tun.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 2ac2164..40b426e 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -297,13 +297,12 @@ static void tun_flow_cleanup(unsigned long data)
spin_unlock_bh(&tun->lock);
}
-static void tun_flow_update(struct tun_struct *tun, struct sk_buff *skb,
+static void tun_flow_update(struct tun_struct *tun, u32 rxhash,
u16 queue_index)
{
struct hlist_head *head;
struct tun_flow_entry *e;
unsigned long delay = tun->ageing_time;
- u32 rxhash = skb_get_rxhash(skb);
if (!rxhash)
return;
@@ -1010,6 +1009,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
int copylen;
bool zerocopy = false;
int err;
+ u32 rxhash;
if (!(tun->flags & TUN_NO_PI)) {
if ((len -= sizeof(pi)) > total_len)
@@ -1162,12 +1162,13 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
skb_shinfo(skb)->tx_flags |= SKBTX_DEV_ZEROCOPY;
}
+ rxhash = skb_get_rxhash(skb);
netif_rx_ni(skb);
tun->dev->stats.rx_packets++;
tun->dev->stats.rx_bytes += len;
- tun_flow_update(tun, skb, tfile->queue_index);
+ tun_flow_update(tun, rxhash, tfile->queue_index);
return total_len;
}
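The pattern behind the patch, reduced to a userspace sketch: once a buffer has been handed to a function that takes ownership of it, the caller must not read it again, so anything still needed is copied out first. Here consume() stands in for netif_rx_ni() and flow_update() for tun_flow_update(); all names and types are simplified stand-ins, not the tun driver's code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct pkt {
    uint32_t hash;
};

/* Stand-in for netif_rx_ni(): takes ownership and may free the packet. */
void consume(struct pkt *p)
{
    free(p);
}

uint32_t last_hash;  /* stand-in for the flow table */

/* Stand-in for tun_flow_update(): takes the hash by value, not the packet. */
void flow_update(uint32_t hash)
{
    if (!hash)
        return;
    last_hash = hash;
}

int receive(uint32_t hash)
{
    struct pkt *p = malloc(sizeof(*p));
    uint32_t cached;

    if (!p)
        return -1;
    p->hash = hash;

    cached = p->hash;    /* read what is needed BEFORE the hand-off */
    consume(p);          /* p must not be dereferenced after this */
    flow_update(cached); /* safe: uses the cached copy only */
    return 0;
}
```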