* Re: FOU RX interface?
From: Andy Lutomirski @ 2014-10-02 15:24 UTC
To: Tom Herbert; +Cc: Network Development, David S. Miller
In-Reply-To: <CA+mtBx_LphKn54F8sVbOqZBbbb9aGda4Kdc-pKhq4yxGk=6GvQ@mail.gmail.com>
On Thu, Oct 2, 2014 at 7:44 AM, Tom Herbert <therbert@google.com> wrote:
> On Wed, Oct 1, 2014 at 10:14 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>> Hi-
>>
>> Sorry for the lack of proper threading here -- I lost the original message.
>>
>> If I'm understanding the FOU use case correctly, if I set up a FOU
>> tunnel tun0 that is encapsulated in UDP on eth0, then tun0 packets
>> will be transmitted on tun0, but incoming packets will show up on eth0
>> when they're reinjected after stripping the FOU header.
>>
> Incoming FOU packets will still land on the tunnel interface. In FOU
> RX the UDP packet is removed and logically re-injected into the
> stack-- at this point the packet is IPIP in IP (or sit, GRE) so
> appropriate tunnel protocol processing occurs.
>
Oh, right. That should have been obvious. Thanks for the clarification!
--Andy
>> Is this right? I think that, without a way to reinject the received
>> packets on the tunnel interface, using FOU will be annoying. For
>> example, writing firewall rules might be tricky. And programs that
>> use packet sockets or SO_BINDTODEVICE could have a hard time being
>> configured such that they notice the received packets.
>>
> I believe it should work.
>
>> Also, is it even possible to assign a FOU tunnel to a different
>> network namespace than the device that's being tunneled over? How
>> will the received packets end up in the right netns?
>>
> Anything you can do with IP tunnels, you should be able to with FOU
> enabled IP tunnels. FOU is transparent to IP tunnels on RX.
>
>> --Andy
--
Andy Lutomirski
AMA Capital Management, LLC
* [RFC PATCH linux 0/2] Optimize network interfaces creation
From: Nicolas Dichtel @ 2014-10-02 15:24 UTC
To: netdev, linux-kernel
Cc: davem, ebiederm, akpm, adobriyan, rui.xiang, viro, oleg, gorcunov,
kirill.shutemov, grant.likely, tytso
In-Reply-To: <20131003.150947.2179820478039260398.davem@davemloft.net>
When a lot of netdevices are created, one of the bottlenecks is the creation
of proc entries. This series aims to accelerate this part.
The first patch only prepares the second one.
I'm not sure which tree this series should be based on. I've based it on
linux.git.
fs/proc/generic.c | 100 +++++++++++++++++++++++++++++++++++++++++------------
fs/proc/internal.h | 49 +++++++++++++++++++++++---
fs/proc/proc_net.c | 8 +++++
fs/proc/root.c | 5 +++
4 files changed, 135 insertions(+), 27 deletions(-)
Comments are welcome.
Regards,
Nicolas
* [RFC PATCH linux 1/2] proc_net: declare /proc/net as a directory
From: Nicolas Dichtel @ 2014-10-02 15:25 UTC
To: netdev, linux-kernel
Cc: davem, ebiederm, akpm, adobriyan, rui.xiang, viro, oleg, gorcunov,
kirill.shutemov, grant.likely, tytso, Thierry Herbelot,
Nicolas Dichtel
In-Reply-To: <1412263501-6572-1-git-send-email-nicolas.dichtel@6wind.com>
From: Thierry Herbelot <thierry.herbelot@6wind.com>
The mode bits are copied from those of "proc_root".
This commit prepares the next patch.
Signed-off-by: Thierry Herbelot <thierry.herbelot@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---
fs/proc/proc_net.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/proc/proc_net.c b/fs/proc/proc_net.c
index a63af3e0a612..6fc308ec105c 100644
--- a/fs/proc/proc_net.c
+++ b/fs/proc/proc_net.c
@@ -193,6 +193,7 @@ static __net_init int proc_net_ns_init(struct net *net)
goto out;
netd->data = net;
+ netd->mode = S_IFDIR | S_IRUGO | S_IXUGO;
netd->nlink = 2;
netd->namelen = 3;
netd->parent = &proc_root;
--
2.1.0
* [RFC PATCH linux 2/2] fs/proc: use a hash table for the directory entries
From: Nicolas Dichtel @ 2014-10-02 15:25 UTC
To: netdev, linux-kernel
Cc: davem, ebiederm, akpm, adobriyan, rui.xiang, viro, oleg, gorcunov,
kirill.shutemov, grant.likely, tytso, Thierry Herbelot,
Nicolas Dichtel
In-Reply-To: <1412263501-6572-1-git-send-email-nicolas.dichtel@6wind.com>
From: Thierry Herbelot <thierry.herbelot@6wind.com>
The current implementation of directories in /proc uses a single
linked list. This is slow when handling directories with large numbers of
entries (e.g. netdevice-related entries when lots of tunnels are opened).
This patch enables multiple linked lists. A hash based on the entry name is
used to select the linked list for a given entry.
Netdevice creation is faster because shorter linked lists are scanned
when adding a new netdevice.
Here are some numbers:
dummy30000.batch contains 30 000 times 'link add type dummy'.
Before the patch:
time ip -b dummy30000.batch
real 2m32.221s
user 0m0.380s
sys 2m30.610s
After the patch:
time ip -b dummy30000.batch
real 1m57.190s
user 0m0.350s
sys 1m56.120s
The single 'subdir' list head is replaced by a subdir hash table. The subdir
hash buckets are only allocated for directories. The number of hash buckets
is a compile-time parameter.
For all functions which handle directory entries, an additional check on the
directory nature of the dir entry ensures that pde_hash_buckets was allocated.
This check was not needed previously, as subdir was present for all dir
entries, whether actual directories or simple files.
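For illustration only, the bucket selection described above can be sketched in
userspace C. This is not the kernel code: every sketch_* name is hypothetical,
a simple string hash stands in for full_name_hash(), and, unlike
proc_register(), the sketch rejects duplicate names instead of warning. The
bucket scan tests the index bound before it dereferences a bucket.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_SHIFT   6
#define SKETCH_BUCKETS (1u << SKETCH_SHIFT)
#define SKETCH_MASK    (SKETCH_BUCKETS - 1)

/* Hypothetical stand-in for a directory entry: one chain link per bucket. */
struct sketch_entry {
	struct sketch_entry *next;
	const char *name;
};

/* Hypothetical stand-in for a directory: one list head per hash bucket. */
struct sketch_dir {
	struct sketch_entry *buckets[SKETCH_BUCKETS];
};

/* Assumption: any reasonable string hash, masked to the bucket count. */
static unsigned int sketch_hash(const char *name, size_t len)
{
	unsigned int h = 5381;
	size_t i;

	for (i = 0; i < len; i++)
		h = h * 33 + (unsigned char)name[i];
	return h & SKETCH_MASK;
}

/* Only the chain selected by the name hash is scanned. */
static struct sketch_entry *sketch_lookup(const struct sketch_dir *dir,
					  const char *name)
{
	struct sketch_entry *e = dir->buckets[sketch_hash(name, strlen(name))];

	for (; e; e = e->next)
		if (strcmp(e->name, name) == 0)
			return e;
	return NULL;
}

/* Insert at the head of the selected chain; -1 if the name already exists. */
static int sketch_insert(struct sketch_dir *dir, struct sketch_entry *e)
{
	unsigned int h = sketch_hash(e->name, strlen(e->name));

	if (sketch_lookup(dir, e->name))
		return -1;
	e->next = dir->buckets[h];
	dir->buckets[h] = e;
	return 0;
}

/* Index of the first non-empty bucket at or after start, else SKETCH_BUCKETS. */
static unsigned int sketch_next_bucket(const struct sketch_dir *dir,
				       unsigned int start)
{
	unsigned int i;

	for (i = start; i < SKETCH_BUCKETS; i++)
		if (dir->buckets[i])
			return i;
	return SKETCH_BUCKETS;
}

/* Walk every chain, as a readdir-style traversal would. */
static unsigned int sketch_count(const struct sketch_dir *dir)
{
	const struct sketch_entry *e;
	unsigned int i, n = 0;

	for (i = sketch_next_bucket(dir, 0); i < SKETCH_BUCKETS;
	     i = sketch_next_bucket(dir, i + 1))
		for (e = dir->buckets[i]; e; e = e->next)
			n++;
	return n;
}
```

With 64 buckets, an insert into a directory of N entries scans roughly N/64
names on average instead of N, which is where the speedup is expected to come
from.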
Signed-off-by: Thierry Herbelot <thierry.herbelot@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---
fs/proc/generic.c | 100 +++++++++++++++++++++++++++++++++++++++++------------
fs/proc/internal.h | 49 +++++++++++++++++++++++---
fs/proc/proc_net.c | 7 ++++
fs/proc/root.c | 5 +++
4 files changed, 134 insertions(+), 27 deletions(-)
diff --git a/fs/proc/generic.c b/fs/proc/generic.c
index 317b72641ebf..c3af8c289c7e 100644
--- a/fs/proc/generic.c
+++ b/fs/proc/generic.c
@@ -81,10 +81,13 @@ static int __xlate_proc_name(const char *name, struct proc_dir_entry **ret,
const char *cp = name, *next;
struct proc_dir_entry *de;
unsigned int len;
+ unsigned int name_hash;
de = *ret;
if (!de)
de = &proc_root;
+ if (!S_ISDIR(de->mode))
+ return -EINVAL;
while (1) {
next = strchr(cp, '/');
@@ -92,7 +95,9 @@ static int __xlate_proc_name(const char *name, struct proc_dir_entry **ret,
break;
len = next - cp;
- for (de = de->subdir; de ; de = de->next) {
+ name_hash = proc_pde_name_hash(cp, len);
+ for (de = de->pde_hash_buckets[name_hash]; de;
+ de = de->next) {
if (proc_match(len, cp, de))
break;
}
@@ -180,10 +185,15 @@ static const struct inode_operations proc_link_inode_operations = {
struct dentry *proc_lookup_de(struct proc_dir_entry *de, struct inode *dir,
struct dentry *dentry)
{
+ unsigned int name_hash;
struct inode *inode;
+ if (!S_ISDIR(de->mode))
+ return NULL;
spin_lock(&proc_subdir_lock);
- for (de = de->subdir; de ; de = de->next) {
+ name_hash = proc_pde_name_hash(dentry->d_name.name,
+ dentry->d_name.len);
+ for (de = de->pde_hash_buckets[name_hash]; de ; de = de->next) {
if (de->namelen != dentry->d_name.len)
continue;
if (!memcmp(dentry->d_name.name, de->name, de->namelen)) {
@@ -219,18 +229,30 @@ struct dentry *proc_lookup(struct inode *dir, struct dentry *dentry,
int proc_readdir_de(struct proc_dir_entry *de, struct file *file,
struct dir_context *ctx)
{
+ struct proc_dir_entry *dir;
+ unsigned int hash_idx = 0;
int i;
+ dir = de;
+ if (!S_ISDIR(dir->mode))
+ return -EINVAL;
if (!dir_emit_dots(file, ctx))
return 0;
spin_lock(&proc_subdir_lock);
- de = de->subdir;
+ /* try first hash bucket */
+ de = de->pde_hash_buckets[0];
+
i = ctx->pos - 2;
for (;;) {
if (!de) {
- spin_unlock(&proc_subdir_lock);
- return 0;
+ /* try next hash bucket if one is available */
+ hash_idx = find_next_hash_bucket(dir, hash_idx + 1);
+ if (hash_idx == PROC_PDE_HASH_BUCKETS) {
+ spin_unlock(&proc_subdir_lock);
+ return 0;
+ }
+ de = dir->pde_hash_buckets[hash_idx];
}
if (!i)
break;
@@ -250,6 +272,12 @@ int proc_readdir_de(struct proc_dir_entry *de, struct file *file,
spin_lock(&proc_subdir_lock);
ctx->pos++;
next = de->next;
+ if (!next) {
+ /* try next hash bucket if one is available */
+ hash_idx = find_next_hash_bucket(dir, hash_idx + 1);
+ if (hash_idx != PROC_PDE_HASH_BUCKETS)
+ next = dir->pde_hash_buckets[hash_idx];
+ }
pde_put(de);
de = next;
} while (de);
@@ -287,8 +315,12 @@ static const struct inode_operations proc_dir_inode_operations = {
static int proc_register(struct proc_dir_entry * dir, struct proc_dir_entry * dp)
{
struct proc_dir_entry *tmp;
+ unsigned int name_hash;
int ret;
-
+
+ if (!S_ISDIR(dir->mode))
+ return -EINVAL;
+
ret = proc_alloc_inum(&dp->low_ino);
if (ret)
return ret;
@@ -309,16 +341,17 @@ static int proc_register(struct proc_dir_entry * dir, struct proc_dir_entry * dp
spin_lock(&proc_subdir_lock);
- for (tmp = dir->subdir; tmp; tmp = tmp->next)
+ name_hash = proc_pde_name_hash(dp->name, strlen(dp->name));
+ for (tmp = dir->pde_hash_buckets[name_hash]; tmp; tmp = tmp->next)
if (strcmp(tmp->name, dp->name) == 0) {
WARN(1, "proc_dir_entry '%s/%s' already registered\n",
dir->name, dp->name);
break;
}
- dp->next = dir->subdir;
+ dp->next = dir->pde_hash_buckets[name_hash];
dp->parent = dir;
- dir->subdir = dp;
+ dir->pde_hash_buckets[name_hash] = dp;
spin_unlock(&proc_subdir_lock);
return 0;
@@ -349,6 +382,14 @@ static struct proc_dir_entry *__proc_create(struct proc_dir_entry **parent,
ent = kzalloc(sizeof(struct proc_dir_entry) + qstr.len + 1, GFP_KERNEL);
if (!ent)
goto out;
+ if (S_ISDIR(mode)) {
+ ent->pde_hash_buckets = kzalloc(PROC_PDE_HASH_SIZE, GFP_KERNEL);
+ if (!ent->pde_hash_buckets) {
+ kfree(ent);
+ ent = NULL;
+ goto out;
+ }
+ }
memcpy(ent->name, fn, qstr.len + 1);
ent->namelen = qstr.len;
@@ -471,6 +512,8 @@ static void free_proc_entry(struct proc_dir_entry *de)
if (S_ISLNK(de->mode))
kfree(de->data);
+ if (S_ISDIR(de->mode))
+ kfree(de->pde_hash_buckets);
kfree(de);
}
@@ -488,7 +531,10 @@ void remove_proc_entry(const char *name, struct proc_dir_entry *parent)
struct proc_dir_entry **p;
struct proc_dir_entry *de = NULL;
const char *fn = name;
- unsigned int len;
+ unsigned int len, name_hash;
+
+ if (!S_ISDIR(parent->mode))
+ return;
spin_lock(&proc_subdir_lock);
if (__xlate_proc_name(name, &parent, &fn) != 0) {
@@ -497,7 +543,8 @@ void remove_proc_entry(const char *name, struct proc_dir_entry *parent)
}
len = strlen(fn);
- for (p = &parent->subdir; *p; p=&(*p)->next ) {
+ name_hash = proc_pde_name_hash(fn, len);
+ for (p = &parent->pde_hash_buckets[name_hash]; *p; p = &(*p)->next) {
if (proc_match(len, fn, *p)) {
de = *p;
*p = de->next;
@@ -516,9 +563,11 @@ void remove_proc_entry(const char *name, struct proc_dir_entry *parent)
if (S_ISDIR(de->mode))
parent->nlink--;
de->nlink = 0;
- WARN(de->subdir, "%s: removing non-empty directory "
- "'%s/%s', leaking at least '%s'\n", __func__,
- de->parent->name, de->name, de->subdir->name);
+ if (S_ISDIR(de->mode))
+ WARN(de->pde_hash_buckets[name_hash],
+ "%s: removing non-empty directory '%s/%s', leaking at least '%s'\n",
+ __func__, de->parent->name, de->name,
+ de->pde_hash_buckets[name_hash]->name);
pde_put(de);
}
EXPORT_SYMBOL(remove_proc_entry);
@@ -528,7 +577,10 @@ int remove_proc_subtree(const char *name, struct proc_dir_entry *parent)
struct proc_dir_entry **p;
struct proc_dir_entry *root = NULL, *de, *next;
const char *fn = name;
- unsigned int len;
+ unsigned int len, name_hash, hash_idx;
+
+ if (!S_ISDIR(parent->mode))
+ return -EINVAL;
spin_lock(&proc_subdir_lock);
if (__xlate_proc_name(name, &parent, &fn) != 0) {
@@ -536,8 +588,9 @@ int remove_proc_subtree(const char *name, struct proc_dir_entry *parent)
return -ENOENT;
}
len = strlen(fn);
+ name_hash = proc_pde_name_hash(fn, len);
- for (p = &parent->subdir; *p; p=&(*p)->next ) {
+ for (p = &parent->pde_hash_buckets[name_hash]; *p; p = &(*p)->next) {
if (proc_match(len, fn, *p)) {
root = *p;
*p = root->next;
@@ -551,12 +604,15 @@ int remove_proc_subtree(const char *name, struct proc_dir_entry *parent)
}
de = root;
while (1) {
- next = de->subdir;
- if (next) {
- de->subdir = next->next;
- next->next = NULL;
- de = next;
- continue;
+ if (S_ISDIR(de->mode)) {
+ hash_idx = find_first_hash_bucket(de);
+ if (hash_idx < PROC_PDE_HASH_BUCKETS) {
+ next = de->pde_hash_buckets[hash_idx];
+ de->pde_hash_buckets[hash_idx] = next->next;
+ next->next = NULL;
+ de = next;
+ continue;
+ }
}
spin_unlock(&proc_subdir_lock);
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 7da13e49128a..7c32e7821453 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -18,16 +18,28 @@
struct ctl_table_header;
struct mempolicy;
+#define PROC_PDE_SHIFT 6
+#define PROC_PDE_HASH_BUCKETS (1 << PROC_PDE_SHIFT)
+#define PROC_PDE_HASH_MASK (PROC_PDE_HASH_BUCKETS - 1)
+#define PROC_PDE_HASH_SIZE (PROC_PDE_HASH_BUCKETS * \
+ sizeof(struct proc_dir_entry *))
+
+static inline unsigned int proc_pde_name_hash(const unsigned char *name,
+ const unsigned int len)
+{
+ return full_name_hash(name, len) & PROC_PDE_HASH_MASK;
+}
+
/*
* This is not completely implemented yet. The idea is to
* create an in-memory tree (like the actual /proc filesystem
* tree) of these proc_dir_entries, so that we can dynamically
* add new files to /proc.
*
- * The "next" pointer creates a linked list of one /proc directory,
- * while parent/subdir create the directory structure (every
- * /proc file has a parent, but "subdir" is NULL for all
- * non-directory entries).
+ * The "pde_hash_buckets" pointer is a hash table of linked lists for
+ * one /proc directory (every /proc file has a parent, but it is NULL
+ * for all non-directory entries). The linked lists are implemented using
+ * the "next" fields of the proc_dir_entry.
*/
struct proc_dir_entry {
unsigned int low_ino;
@@ -38,7 +50,7 @@ struct proc_dir_entry {
loff_t size;
const struct inode_operations *proc_iops;
const struct file_operations *proc_fops;
- struct proc_dir_entry *next, *parent, *subdir;
+ struct proc_dir_entry *next, *parent;
void *data;
atomic_t count; /* use count */
atomic_t in_use; /* number of callers into module in progress; */
@@ -46,10 +58,37 @@ struct proc_dir_entry {
struct completion *pde_unload_completion;
struct list_head pde_openers; /* who did ->open, but not ->release */
spinlock_t pde_unload_lock; /* proc_fops checks and pde_users bumps */
+ struct proc_dir_entry **pde_hash_buckets; /* hash table for dirs */
u8 namelen;
char name[];
};
+/* index for the next non-NULL bucket in the hash table */
+static inline unsigned int find_next_hash_bucket(const struct proc_dir_entry *pde,
+ const unsigned int start)
+{
+ unsigned int i;
+
+ if (start >= PROC_PDE_HASH_BUCKETS)
+ return PROC_PDE_HASH_BUCKETS;
+
+ for (i = start;
+ (i < PROC_PDE_HASH_BUCKETS) && (!pde->pde_hash_buckets[i]);
+ i++)
+ ;
+
+ if (!pde->pde_hash_buckets[i])
+ i = PROC_PDE_HASH_BUCKETS;
+
+ return i;
+}
+
+/* index for the first non-NULL bucket in the hash table */
+static inline int find_first_hash_bucket(const struct proc_dir_entry *pde)
+{
+ return find_next_hash_bucket(pde, 0);
+}
+
union proc_op {
int (*proc_get_link)(struct dentry *, struct path *);
int (*proc_show)(struct seq_file *m,
diff --git a/fs/proc/proc_net.c b/fs/proc/proc_net.c
index 6fc308ec105c..da2e9f4533ed 100644
--- a/fs/proc/proc_net.c
+++ b/fs/proc/proc_net.c
@@ -191,6 +191,12 @@ static __net_init int proc_net_ns_init(struct net *net)
netd = kzalloc(sizeof(*netd) + 4, GFP_KERNEL);
if (!netd)
goto out;
+ netd->pde_hash_buckets = kzalloc(PROC_PDE_HASH_SIZE, GFP_KERNEL);
+ if (!netd->pde_hash_buckets) {
+ kfree(netd);
+ netd = NULL;
+ goto out;
+ }
netd->data = net;
netd->mode = S_IFDIR | S_IRUGO | S_IXUGO;
@@ -217,6 +223,7 @@ out:
static __net_exit void proc_net_ns_exit(struct net *net)
{
remove_proc_entry("stat", net->proc_net);
+ kfree(net->proc_net->pde_hash_buckets);
kfree(net->proc_net);
}
diff --git a/fs/proc/root.c b/fs/proc/root.c
index 094e44d4a6be..bcd830465871 100644
--- a/fs/proc/root.c
+++ b/fs/proc/root.c
@@ -20,6 +20,7 @@
#include <linux/mount.h>
#include <linux/pid_namespace.h>
#include <linux/parser.h>
+#include <linux/slab.h>
#include "internal.h"
@@ -166,6 +167,10 @@ void __init proc_root_init(void)
{
int err;
+ proc_root.pde_hash_buckets = kzalloc(PROC_PDE_HASH_SIZE, GFP_KERNEL);
+ if (!proc_root.pde_hash_buckets)
+ return;
+
proc_init_inodecache();
err = register_filesystem(&proc_fs_type);
if (err)
--
2.1.0
* Re: [net-next PATCH V6 0/2] qdisc: bulk dequeue support
From: Tom Herbert @ 2014-10-02 15:27 UTC
To: Eric Dumazet
Cc: Amir Vadai, Jesper Dangaard Brouer, Linux Netdev List,
David S. Miller, Hannes Frederic Sowa, Florian Westphal,
Daniel Borkmann, Jamal Hadi Salim, Alexander Duyck,
John Fastabend, Dave Taht, Toke Høiland-Jørgensen
In-Reply-To: <1412262270.16704.103.camel@edumazet-glaptop2.roam.corp.google.com>
On Thu, Oct 2, 2014 at 8:04 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2014-10-02 at 07:42 -0700, Tom Herbert wrote:
>
>
>> Unfortunately we probably still need something. If BQL were disabled
>> (by setting BQL min_limit to infinity) then we'll always dequeue all
>> the packets in the qdisc. Disabling BQL might be legitimate in
>> deployment if say a bug is found in a device that prevents prompt
>> transmit completions for some corner case.
>
>
> Unfortunately, there is nothing we can do if a ndo_start_xmit() function
> is buggy.
>
I was referring more to device bugs not driver bugs. But even so, if a
user hits a problem like this in the system, disabling something like
BQL would probably be a better recourse than bringing their network
down.
At a minimum we should test with BQL disabled; it might not be
so horrible. I suppose in the worst case we're at least limited by the
qdisc limit.
> For example, if a prior packet is queued on NIC TX ring (with xmit_more
> set to 1)
>
> Then following packet given to ndo_start_xmit() returns infamous
> NETDEV_TX_BUSY, we are screwed.
>
> mlx4 driver for example has this bug.
>
>
> That's why I advocate we have no artificial limit: we are going to catch
> driver bugs sooner and fix them before xmit_more support is validated,
> for each driver that claims to support it.
>
> Amir, the problem with NETDEV_TX_BUSY in mlx4 needs to be fixed.
>
>
* Re: netlink NETLINK_ROUTE failure & Can the kernel really handle IPv6 properly
From: Dan Williams @ 2014-10-02 15:44 UTC
To: Ulf samuelsson; +Cc: Hannes Frederic Sowa, Linux Netdev List
In-Reply-To: <9330C415-574C-492B-A7AE-92EBCF6D1A26@emagii.com>
On Thu, 2014-10-02 at 14:38 +0200, Ulf samuelsson wrote:
> This is the significant code, and it is wrong.
>
> static struct notifier_block my_ipv6_address_notifier =
> {
> my_ipv6_address_notifier_cb,
> NULL,
> 0
> };
>
> register_inet6addr_notifier (&my_ipv6_address_notifier );
>
> int
> my_ipv6_address_notifier_cb (struct notifier_block *self,
> unsigned long event, void *val)
> {
> struct inet6_ifaddr *ifaddr = (struct inet6_ifaddr *)val;
>
>
> /* We are only interested in address add/delete events */
> /* IPv6 address add comes as NETDEV_UP and delete comes as
> * NETDEV_DOWN
> */
> if ((event != NETDEV_UP) && (event != NETDEV_DOWN))
> return ret;
>
> if (ifaddr == NULL)
> return ret;
> /* Now that we are sure that it is a IPv6 address being added deleted,
> * verify that it is a link local address.
> */
> if (!IPV6_IS_ADDR_LINKLOCAL (&ifaddr->addr))
> {
> return ret;
> }
> ...
> send_message_to_app(LINK_LOCAL_UP, ip);
> ...
> return ret;
> }
>
>
> Application tries to send message to "ip" and fails, because the link-local address is still
> in "tentative state"
It seems to me that a better architecture would be to have the app
itself listen for RTM_NEWADDR netlink event and look for lack of
IFA_F_TENTATIVE in the IFA_FLAGS attribute. Using a kernel module to do
the same thing seems pretty wrong.
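A minimal userspace sketch of that approach, assuming the kernel emits a
further RTM_NEWADDR once DAD completes and the tentative flag clears. The
function names are hypothetical. The flags come from the IFA_FLAGS attribute
when it is present and from the 8-bit ifa_flags header field otherwise.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/if_addr.h>

/* Return 1 if nh announces a non-tentative IPv6 link-local address. */
int ll_addr_ready(struct nlmsghdr *nh)
{
	struct ifaddrmsg *ifa;
	struct rtattr *rta;
	struct in6_addr addr;
	uint32_t flags;
	int len, have_addr = 0;

	if (nh->nlmsg_type != RTM_NEWADDR ||
	    nh->nlmsg_len < NLMSG_LENGTH(sizeof(*ifa)))
		return 0;
	ifa = NLMSG_DATA(nh);
	if (ifa->ifa_family != AF_INET6)
		return 0;

	memset(&addr, 0, sizeof(addr));
	flags = ifa->ifa_flags;		/* fallback when IFA_FLAGS is absent */
	len = IFA_PAYLOAD(nh);
	for (rta = IFA_RTA(ifa); RTA_OK(rta, len); rta = RTA_NEXT(rta, len)) {
		size_t plen = RTA_PAYLOAD(rta);

		if (rta->rta_type == IFA_FLAGS && plen >= sizeof(flags)) {
			memcpy(&flags, RTA_DATA(rta), sizeof(flags));
		} else if (rta->rta_type == IFA_ADDRESS &&
			   plen >= sizeof(addr)) {
			memcpy(&addr, RTA_DATA(rta), sizeof(addr));
			have_addr = 1;
		}
	}
	return have_addr && IN6_IS_ADDR_LINKLOCAL(&addr) &&
	       !(flags & IFA_F_TENTATIVE);
}

/* Open a NETLINK_ROUTE socket subscribed to IPv6 address events. */
int open_addr_listener(void)
{
	struct sockaddr_nl sa;
	int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

	if (fd < 0)
		return -1;
	memset(&sa, 0, sizeof(sa));
	sa.nl_family = AF_NETLINK;
	sa.nl_groups = RTMGRP_IPV6_IFADDR;
	if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Block until a usable link-local address shows up; -1 on receive error. */
int wait_for_ll_addr(int fd)
{
	union { struct nlmsghdr nh; char raw[8192]; } buf;

	for (;;) {
		struct nlmsghdr *nh;
		int len = (int)recv(fd, &buf, sizeof(buf), 0);

		if (len <= 0)
			return -1;
		for (nh = &buf.nh; NLMSG_OK(nh, len); nh = NLMSG_NEXT(nh, len))
			if (ll_addr_ready(nh))
				return 0;
	}
}
```

The app would call open_addr_listener() before the interface comes up and
wait_for_ll_addr() before it first sends to the link-local address. An address
that was already usable before the socket was opened generates no event, so a
real application would presumably also issue an RTM_GETADDR dump at startup.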
Dan
> Best Regards
> Ulf Samuelsson
> ulf@emagii.com
> +46 (722) 427 437
>
>
> > 1 okt 2014 kl. 23:30 skrev Hannes Frederic Sowa <hannes@stressinduktion.org>:
> >
> > Hello,
> >
> >> On Wed, Oct 1, 2014, at 22:28, Ulf Samuelsson wrote:
> >> BTW, the problem I am trying to solve is how to connect to an I/F with
> >> an IPv6 link-local address.
> >>
> >> An existing kernel module waits for a NETDEV_UP event, and then tries to
> >> communicate
> >> with the link-local address.
> >>
> >> This will fail, because (according to a colleague) the I/F enters a
> >> "tentative" state,
> >> where it is trying to decide if it is unique or not.
> >> It will remain in that state for 1-2 seconds, and only afterwards is the
> >> link-local address
> >> available for normal use.
> >>
> >> The guys writing the module, claim that the kernel is using NETDEV_UP.
> >> There is very little code in the kernel using NETLINK_ROUTE, even in
> >> latest stable.
> >> It is using NETDEV_UP.
> >>
> >> If my colleague is right, the kernel really cannot handle IPv6
> >> link-local addresses properly.
> >
> > Sorry, I cannot really follow you, can you send example code or be a bit
> > more precise?
> >
> > Thanks,
> > Hannes
> > --
> > To unsubscribe from this list: send the line "unsubscribe netdev" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: [PATCH] ipvs: Clean up comment style in ip_vs.h
From: Pablo Neira Ayuso @ 2014-10-02 16:23 UTC
To: Simon Horman
Cc: lvs-devel, netdev, netfilter-devel, Wensong Zhang,
Julian Anastasov, Sergei Shtylyov, David Miller
In-Reply-To: <1412041806-9480-1-git-send-email-horms@verge.net.au>
[-- Attachment #1: Type: text/plain, Size: 263 bytes --]
Hi Simon,
I made a second pass over your original patch. Mostly some leftovers and
some missing line breaks that I added myself.
Let me know if you have any objection to this, so I'll include this
in the next batch that I'll send to David by tomorrow.
Thanks.
[-- Attachment #2: 0001-ipvs-Clean-up-comment-style-in-ip_vs.h.patch --]
[-- Type: text/x-diff, Size: 18040 bytes --]
>From 87e9ac7144d529e6fd58dad1e222842b8de5ad8d Mon Sep 17 00:00:00 2001
From: Simon Horman <horms@verge.net.au>
Date: Tue, 30 Sep 2014 10:50:06 +0900
Subject: [PATCH] ipvs: Clean up comment style in ip_vs.h
* Consistently use the multi-line comment style for networking code:
/* This
* That
* The other thing
*/
* Use single-line comment style for comments with only one line of text.
* In general follow the leading '*' of each line of a comment with a
single space and then text.
* Add missing line break between functions, remove double line break,
align comments to previous lines whenever possible.
Reported-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
include/net/ip_vs.h | 214 ++++++++++++++++++---------------------------------
1 file changed, 75 insertions(+), 139 deletions(-)
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 576d7f0..615b20b 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -1,6 +1,5 @@
-/*
- * IP Virtual Server
- * data structure and functionality definitions
+/* IP Virtual Server
+ * data structure and functionality definitions
*/
#ifndef _NET_IP_VS_H
@@ -12,7 +11,7 @@
#include <linux/list.h> /* for struct list_head */
#include <linux/spinlock.h> /* for struct rwlock_t */
-#include <linux/atomic.h> /* for struct atomic_t */
+#include <linux/atomic.h> /* for struct atomic_t */
#include <linux/compiler.h>
#include <linux/timer.h>
#include <linux/bug.h>
@@ -30,15 +29,13 @@
#endif
#include <net/net_namespace.h> /* Netw namespace */
-/*
- * Generic access of ipvs struct
- */
+/* Generic access of ipvs struct */
static inline struct netns_ipvs *net_ipvs(struct net* net)
{
return net->ipvs;
}
-/*
- * Get net ptr from skb in traffic cases
+
+/* Get net ptr from skb in traffic cases
* use skb_sknet when call is from userland (ioctl or netlink)
*/
static inline struct net *skb_net(const struct sk_buff *skb)
@@ -90,8 +87,8 @@ static inline struct net *skb_sknet(const struct sk_buff *skb)
return &init_net;
#endif
}
-/*
- * This one needed for single_open_net since net is stored directly in
+
+/* This one needed for single_open_net since net is stored directly in
* private not as a struct i.e. seq_file_net can't be used.
*/
static inline struct net *seq_file_single_net(struct seq_file *seq)
@@ -108,7 +105,7 @@ extern int ip_vs_conn_tab_size;
struct ip_vs_iphdr {
__u32 len; /* IPv4 simply where L4 starts
- IPv6 where L4 Transport Header starts */
+ * IPv6 where L4 Transport Header starts */
__u16 fragoffs; /* IPv6 fragment offset, 0 if first frag (or not frag)*/
__s16 protocol;
__s32 flags;
@@ -304,16 +301,11 @@ static inline const char *ip_vs_dbg_addr(int af, char *buf, size_t buf_len,
#define LeaveFunction(level) do {} while (0)
#endif
-
-/*
- * The port number of FTP service (in network order).
- */
+/* The port number of FTP service (in network order). */
#define FTPPORT cpu_to_be16(21)
#define FTPDATA cpu_to_be16(20)
-/*
- * TCP State Values
- */
+/* TCP State Values */
enum {
IP_VS_TCP_S_NONE = 0,
IP_VS_TCP_S_ESTABLISHED,
@@ -329,25 +321,19 @@ enum {
IP_VS_TCP_S_LAST
};
-/*
- * UDP State Values
- */
+/* UDP State Values */
enum {
IP_VS_UDP_S_NORMAL,
IP_VS_UDP_S_LAST,
};
-/*
- * ICMP State Values
- */
+/* ICMP State Values */
enum {
IP_VS_ICMP_S_NORMAL,
IP_VS_ICMP_S_LAST,
};
-/*
- * SCTP State Values
- */
+/* SCTP State Values */
enum ip_vs_sctp_states {
IP_VS_SCTP_S_NONE,
IP_VS_SCTP_S_INIT1,
@@ -366,21 +352,18 @@ enum ip_vs_sctp_states {
IP_VS_SCTP_S_LAST
};
-/*
- * Delta sequence info structure
- * Each ip_vs_conn has 2 (output AND input seq. changes).
- * Only used in the VS/NAT.
+/* Delta sequence info structure
+ * Each ip_vs_conn has 2 (output AND input seq. changes).
+ * Only used in the VS/NAT.
*/
struct ip_vs_seq {
__u32 init_seq; /* Add delta from this seq */
__u32 delta; /* Delta in sequence numbers */
__u32 previous_delta; /* Delta in sequence numbers
- before last resized pkt */
+ * before last resized pkt */
};
-/*
- * counters per cpu
- */
+/* counters per cpu */
struct ip_vs_counters {
__u32 conns; /* connections scheduled */
__u32 inpkts; /* incoming packets */
@@ -388,17 +371,13 @@ struct ip_vs_counters {
__u64 inbytes; /* incoming bytes */
__u64 outbytes; /* outgoing bytes */
};
-/*
- * Stats per cpu
- */
+/* Stats per cpu */
struct ip_vs_cpu_stats {
struct ip_vs_counters ustats;
struct u64_stats_sync syncp;
};
-/*
- * IPVS statistics objects
- */
+/* IPVS statistics objects */
struct ip_vs_estimator {
struct list_head list;
@@ -491,9 +470,7 @@ struct ip_vs_protocol {
void (*timeout_change)(struct ip_vs_proto_data *pd, int flags);
};
-/*
- * protocol data per netns
- */
+/* protocol data per netns */
struct ip_vs_proto_data {
struct ip_vs_proto_data *next;
struct ip_vs_protocol *pp;
@@ -520,9 +497,7 @@ struct ip_vs_conn_param {
__u8 pe_data_len;
};
-/*
- * IP_VS structure allocated for each dynamically scheduled connection
- */
+/* IP_VS structure allocated for each dynamically scheduled connection */
struct ip_vs_conn {
struct hlist_node c_list; /* hashed list heads */
/* Protocol, addresses and port numbers */
@@ -561,17 +536,18 @@ struct ip_vs_conn {
struct ip_vs_dest *dest; /* real server */
atomic_t in_pkts; /* incoming packet counter */
- /* packet transmitter for different forwarding methods. If it
- mangles the packet, it must return NF_DROP or better NF_STOLEN,
- otherwise this must be changed to a sk_buff **.
- NF_ACCEPT can be returned when destination is local.
+ /* Packet transmitter for different forwarding methods. If it
+ * mangles the packet, it must return NF_DROP or better NF_STOLEN,
+ * otherwise this must be changed to a sk_buff **.
+ * NF_ACCEPT can be returned when destination is local.
*/
int (*packet_xmit)(struct sk_buff *skb, struct ip_vs_conn *cp,
struct ip_vs_protocol *pp, struct ip_vs_iphdr *iph);
/* Note: we can group the following members into a structure,
- in order to save more space, and the following members are
- only used in VS/NAT anyway */
+ * in order to save more space, and the following members are
+ * only used in VS/NAT anyway
+ */
struct ip_vs_app *app; /* bound ip_vs_app object */
void *app_data; /* Application private data */
struct ip_vs_seq in_seq; /* incoming seq. struct */
@@ -584,9 +560,7 @@ struct ip_vs_conn {
struct rcu_head rcu_head;
};
-/*
- * To save some memory in conn table when name space is disabled.
- */
+/* To save some memory in conn table when name space is disabled. */
static inline struct net *ip_vs_conn_net(const struct ip_vs_conn *cp)
{
#ifdef CONFIG_NET_NS
@@ -595,6 +569,7 @@ static inline struct net *ip_vs_conn_net(const struct ip_vs_conn *cp)
return &init_net;
#endif
}
+
static inline void ip_vs_conn_net_set(struct ip_vs_conn *cp, struct net *net)
{
#ifdef CONFIG_NET_NS
@@ -612,13 +587,12 @@ static inline int ip_vs_conn_net_eq(const struct ip_vs_conn *cp,
#endif
}
-/*
- * Extended internal versions of struct ip_vs_service_user and
- * ip_vs_dest_user for IPv6 support.
+/* Extended internal versions of struct ip_vs_service_user and ip_vs_dest_user
+ * for IPv6 support.
*
- * We need these to conveniently pass around service and destination
- * options, but unfortunately, we also need to keep the old definitions to
- * maintain userspace backwards compatibility for the setsockopt interface.
+ * We need these to conveniently pass around service and destination
+ * options, but unfortunately, we also need to keep the old definitions to
+ * maintain userspace backwards compatibility for the setsockopt interface.
*/
struct ip_vs_service_user_kern {
/* virtual service addresses */
@@ -656,8 +630,8 @@ struct ip_vs_dest_user_kern {
/*
- * The information about the virtual service offered to the net
- * and the forwarding entries
+ * The information about the virtual service offered to the net and the
+ * forwarding entries.
*/
struct ip_vs_service {
struct hlist_node s_list; /* for normal service table */
@@ -697,9 +671,8 @@ struct ip_vs_dest_dst {
struct rcu_head rcu_head;
};
-/*
- * The real server destination forwarding entry
- * with ip address, port number, and so on.
+/* The real server destination forwarding entry with ip address, port number,
+ * and so on.
*/
struct ip_vs_dest {
struct list_head n_list; /* for the dests in the service */
@@ -738,10 +711,7 @@ struct ip_vs_dest {
unsigned int in_rs_table:1; /* we are in rs_table */
};
-
-/*
- * The scheduler object
- */
+/* The scheduler object */
struct ip_vs_scheduler {
struct list_head n_list; /* d-linked list head */
char *name; /* scheduler name */
@@ -781,9 +751,7 @@ struct ip_vs_pe {
int (*show_pe_data)(const struct ip_vs_conn *cp, char *buf);
};
-/*
- * The application module object (a.k.a. app incarnation)
- */
+/* The application module object (a.k.a. app incarnation) */
struct ip_vs_app {
struct list_head a_list; /* member in app list */
int type; /* IP_VS_APP_TYPE_xxx */
@@ -799,16 +767,14 @@ struct ip_vs_app {
atomic_t usecnt; /* usage counter */
struct rcu_head rcu_head;
- /*
- * output hook: Process packet in inout direction, diff set for TCP.
+ /* output hook: Process packet in inout direction, diff set for TCP.
* Return: 0=Error, 1=Payload Not Mangled/Mangled but checksum is ok,
* 2=Mangled but checksum was not updated
*/
int (*pkt_out)(struct ip_vs_app *, struct ip_vs_conn *,
struct sk_buff *, int *diff);
- /*
- * input hook: Process packet in outin direction, diff set for TCP.
+ /* input hook: Process packet in outin direction, diff set for TCP.
* Return: 0=Error, 1=Payload Not Mangled/Mangled but checksum is ok,
* 2=Mangled but checksum was not updated
*/
@@ -867,9 +833,7 @@ struct ipvs_master_sync_state {
struct netns_ipvs {
int gen; /* Generation */
int enable; /* enable like nf_hooks do */
- /*
- * Hash table: for real service lookups
- */
+ /* Hash table: for real service lookups */
#define IP_VS_RTAB_BITS 4
#define IP_VS_RTAB_SIZE (1 << IP_VS_RTAB_BITS)
#define IP_VS_RTAB_MASK (IP_VS_RTAB_SIZE - 1)
@@ -903,7 +867,7 @@ struct netns_ipvs {
struct list_head sctp_apps[SCTP_APP_TAB_SIZE];
#endif
/* ip_vs_conn */
- atomic_t conn_count; /* connection counter */
+ atomic_t conn_count; /* connection counter */
/* ip_vs_ctl */
struct ip_vs_stats tot_stats; /* Statistics & est. */
@@ -990,9 +954,9 @@ struct netns_ipvs {
char backup_mcast_ifn[IP_VS_IFNAME_MAXLEN];
/* net name space ptr */
struct net *net; /* Needed by timer routines */
- /* Number of heterogeneous destinations, needed because
- * heterogeneous are not supported when synchronization is
- * enabled */
+ /* Number of heterogeneous destinations, needed because heterogeneous
+ * are not supported when synchronization is enabled.
+ */
unsigned int mixed_address_family_dests;
};
@@ -1147,9 +1111,8 @@ static inline int sysctl_backup_only(struct netns_ipvs *ipvs)
#endif
-/*
- * IPVS core functions
- * (from ip_vs_core.c)
+/* IPVS core functions
+ * (from ip_vs_core.c)
*/
const char *ip_vs_proto_name(unsigned int proto);
void ip_vs_init_hash_table(struct list_head *table, int rows);
@@ -1157,11 +1120,9 @@ void ip_vs_init_hash_table(struct list_head *table, int rows);
#define IP_VS_APP_TYPE_FTP 1
-/*
- * ip_vs_conn handling functions
- * (from ip_vs_conn.c)
+/* ip_vs_conn handling functions
+ * (from ip_vs_conn.c)
*/
-
enum {
IP_VS_DIR_INPUT = 0,
IP_VS_DIR_OUTPUT,
@@ -1292,9 +1253,7 @@ ip_vs_control_add(struct ip_vs_conn *cp, struct ip_vs_conn *ctl_cp)
atomic_inc(&ctl_cp->n_control);
}
-/*
- * IPVS netns init & cleanup functions
- */
+/* IPVS netns init & cleanup functions */
int ip_vs_estimator_net_init(struct net *net);
int ip_vs_control_net_init(struct net *net);
int ip_vs_protocol_net_init(struct net *net);
@@ -1309,9 +1268,8 @@ void ip_vs_estimator_net_cleanup(struct net *net);
void ip_vs_sync_net_cleanup(struct net *net);
void ip_vs_service_net_cleanup(struct net *net);
-/*
- * IPVS application functions
- * (from ip_vs_app.c)
+/* IPVS application functions
+ * (from ip_vs_app.c)
*/
#define IP_VS_APP_MAX_PORTS 8
struct ip_vs_app *register_ip_vs_app(struct net *net, struct ip_vs_app *app);
@@ -1331,9 +1289,7 @@ int unregister_ip_vs_pe(struct ip_vs_pe *pe);
struct ip_vs_pe *ip_vs_pe_getbyname(const char *name);
struct ip_vs_pe *__ip_vs_pe_getbyname(const char *pe_name);
-/*
- * Use a #define to avoid all of module.h just for these trivial ops
- */
+/* Use a #define to avoid all of module.h just for these trivial ops */
#define ip_vs_pe_get(pe) \
if (pe && pe->module) \
__module_get(pe->module);
@@ -1342,9 +1298,7 @@ struct ip_vs_pe *__ip_vs_pe_getbyname(const char *pe_name);
if (pe && pe->module) \
module_put(pe->module);
-/*
- * IPVS protocol functions (from ip_vs_proto.c)
- */
+/* IPVS protocol functions (from ip_vs_proto.c) */
int ip_vs_protocol_init(void);
void ip_vs_protocol_cleanup(void);
void ip_vs_protocol_timeout_change(struct netns_ipvs *ipvs, int flags);
@@ -1362,9 +1316,8 @@ extern struct ip_vs_protocol ip_vs_protocol_esp;
extern struct ip_vs_protocol ip_vs_protocol_ah;
extern struct ip_vs_protocol ip_vs_protocol_sctp;
-/*
- * Registering/unregistering scheduler functions
- * (from ip_vs_sched.c)
+/* Registering/unregistering scheduler functions
+ * (from ip_vs_sched.c)
*/
int register_ip_vs_scheduler(struct ip_vs_scheduler *scheduler);
int unregister_ip_vs_scheduler(struct ip_vs_scheduler *scheduler);
@@ -1383,10 +1336,7 @@ int ip_vs_leave(struct ip_vs_service *svc, struct sk_buff *skb,
void ip_vs_scheduler_err(struct ip_vs_service *svc, const char *msg);
-
-/*
- * IPVS control data and functions (from ip_vs_ctl.c)
- */
+/* IPVS control data and functions (from ip_vs_ctl.c) */
extern struct ip_vs_stats ip_vs_stats;
extern int sysctl_ip_vs_sync_ver;
@@ -1427,26 +1377,21 @@ static inline void ip_vs_dest_put_and_free(struct ip_vs_dest *dest)
kfree(dest);
}
-/*
- * IPVS sync daemon data and function prototypes
- * (from ip_vs_sync.c)
+/* IPVS sync daemon data and function prototypes
+ * (from ip_vs_sync.c)
*/
int start_sync_thread(struct net *net, int state, char *mcast_ifn, __u8 syncid);
int stop_sync_thread(struct net *net, int state);
void ip_vs_sync_conn(struct net *net, struct ip_vs_conn *cp, int pkts);
-/*
- * IPVS rate estimator prototypes (from ip_vs_est.c)
- */
+/* IPVS rate estimator prototypes (from ip_vs_est.c) */
void ip_vs_start_estimator(struct net *net, struct ip_vs_stats *stats);
void ip_vs_stop_estimator(struct net *net, struct ip_vs_stats *stats);
void ip_vs_zero_estimator(struct ip_vs_stats *stats);
void ip_vs_read_estimator(struct ip_vs_stats_user *dst,
struct ip_vs_stats *stats);
-/*
- * Various IPVS packet transmitters (from ip_vs_xmit.c)
- */
+/* Various IPVS packet transmitters (from ip_vs_xmit.c) */
int ip_vs_null_xmit(struct sk_buff *skb, struct ip_vs_conn *cp,
struct ip_vs_protocol *pp, struct ip_vs_iphdr *iph);
int ip_vs_bypass_xmit(struct sk_buff *skb, struct ip_vs_conn *cp,
@@ -1477,12 +1422,10 @@ int ip_vs_icmp_xmit_v6(struct sk_buff *skb, struct ip_vs_conn *cp,
#endif
#ifdef CONFIG_SYSCTL
-/*
- * This is a simple mechanism to ignore packets when
- * we are loaded. Just set ip_vs_drop_rate to 'n' and
- * we start to drop 1/rate of the packets
+/* This is a simple mechanism to ignore packets when
+ * we are loaded. Just set ip_vs_drop_rate to 'n' and
+ * we start to drop 1/rate of the packets
*/
-
static inline int ip_vs_todrop(struct netns_ipvs *ipvs)
{
if (!ipvs->drop_rate)
@@ -1496,9 +1439,7 @@ static inline int ip_vs_todrop(struct netns_ipvs *ipvs)
static inline int ip_vs_todrop(struct netns_ipvs *ipvs) { return 0; }
#endif
-/*
- * ip_vs_fwd_tag returns the forwarding tag of the connection
- */
+/* ip_vs_fwd_tag returns the forwarding tag of the connection */
#define IP_VS_FWD_METHOD(cp) (cp->flags & IP_VS_CONN_F_FWD_MASK)
static inline char ip_vs_fwd_tag(struct ip_vs_conn *cp)
@@ -1557,9 +1498,7 @@ static inline __wsum ip_vs_check_diff2(__be16 old, __be16 new, __wsum oldsum)
return csum_partial(diff, sizeof(diff), oldsum);
}
-/*
- * Forget current conntrack (unconfirmed) and attach notrack entry
- */
+/* Forget current conntrack (unconfirmed) and attach notrack entry */
static inline void ip_vs_notrack(struct sk_buff *skb)
{
#if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE)
@@ -1576,9 +1515,8 @@ static inline void ip_vs_notrack(struct sk_buff *skb)
}
#ifdef CONFIG_IP_VS_NFCT
-/*
- * Netfilter connection tracking
- * (from ip_vs_nfct.c)
+/* Netfilter connection tracking
+ * (from ip_vs_nfct.c)
*/
static inline int ip_vs_conntrack_enabled(struct netns_ipvs *ipvs)
{
@@ -1617,14 +1555,12 @@ static inline int ip_vs_confirm_conntrack(struct sk_buff *skb)
static inline void ip_vs_conn_drop_conntrack(struct ip_vs_conn *cp)
{
}
-/* CONFIG_IP_VS_NFCT */
-#endif
+#endif /* CONFIG_IP_VS_NFCT */
static inline int
ip_vs_dest_conn_overhead(struct ip_vs_dest *dest)
{
- /*
- * We think the overhead of processing active connections is 256
+ /* We think the overhead of processing active connections is 256
* times higher than that of inactive connections in average. (This
* 256 times might not be accurate, we will change it later) We
* use the following formula to estimate the overhead now:
--
1.7.10.4
^ permalink raw reply related
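The ip_vs_todrop() comment in the patch above describes dropping one packet out of every 'rate' packets. A minimal userspace sketch of such a countdown follows. The names `drop_state` and `should_drop` are hypothetical, and the body is an assumption modeled on that comment, not a copy of the kernel function.

```c
/* Hypothetical stand-in for the per-netns drop settings. */
struct drop_state {
	int drop_rate;		/* 0 disables dropping */
	int drop_counter;	/* packets left before the next drop */
};

/* Returns 1 when the current packet should be dropped, 0 otherwise. */
int should_drop(struct drop_state *s)
{
	if (!s->drop_rate)
		return 0;
	if (--s->drop_counter > 0)
		return 0;
	/* Counter expired: re-arm it and drop this packet. */
	s->drop_counter = s->drop_rate;
	return 1;
}
```

With drop_rate and drop_counter both set to 4, every fourth call reports a drop, which matches the "1/rate" wording of the comment.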
* [PATCH net] ip6_gre: fix flowi6_proto value in xmit path
From: Nicolas Dichtel @ 2014-10-02 16:26 UTC (permalink / raw)
To: davem; +Cc: netdev, Nicolas Dichtel
In xmit path, we build a flowi6 which will be used for the output route lookup.
We are sending a GRE packet, not an IPv4- or IPv6-encapsulated packet, so the
protocol should be IPPROTO_GRE.
Fixes: c12b395a4664 ("gre: Support GRE over IPv6")
Reported-by: Matthieu Ternisien d'Ouville <matthieu.tdo@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---
net/ipv6/ip6_gre.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index f304471477dc..97299d76c1b0 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -782,7 +782,7 @@ static inline int ip6gre_xmit_ipv4(struct sk_buff *skb, struct net_device *dev)
encap_limit = t->parms.encap_limit;
memcpy(&fl6, &t->fl.u.ip6, sizeof(fl6));
- fl6.flowi6_proto = IPPROTO_IPIP;
+ fl6.flowi6_proto = IPPROTO_GRE;
dsfield = ipv4_get_dsfield(iph);
@@ -832,7 +832,7 @@ static inline int ip6gre_xmit_ipv6(struct sk_buff *skb, struct net_device *dev)
encap_limit = t->parms.encap_limit;
memcpy(&fl6, &t->fl.u.ip6, sizeof(fl6));
- fl6.flowi6_proto = IPPROTO_IPV6;
+ fl6.flowi6_proto = IPPROTO_GRE;
dsfield = ipv6_get_dsfield(ipv6h);
if (t->parms.flags & IP6_TNL_F_USE_ORIG_TCLASS)
--
2.1.0
^ permalink raw reply related
* [PATCH net-next] net: better IFF_XMIT_DST_RELEASE support
From: Eric Dumazet @ 2014-10-02 16:34 UTC (permalink / raw)
To: David Miller; +Cc: netdev
From: Eric Dumazet <edumazet@google.com>
Testing xmit_more support with netperf and connected UDP sockets,
I found strange dst refcount false sharing.
Current handling of IFF_XMIT_DST_RELEASE is not optimal.
Dropping dst in validate_xmit_skb() is certainly too late if the
packet was queued by cpu X but dequeued by cpu Y.
The logical point to take care of drop/force is in __dev_queue_xmit(),
before even taking the qdisc lock.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
include/linux/netdevice.h | 3 +--
net/core/dev.c | 16 ++++++++--------
2 files changed, 9 insertions(+), 10 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 9b7fbacb6296..ea8b23510c9e 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1168,8 +1168,7 @@ struct net_device_ops {
* @IFF_ISATAP: ISATAP interface (RFC4214)
* @IFF_MASTER_ARPMON: bonding master, ARP mon in use
* @IFF_WAN_HDLC: WAN HDLC device
- * @IFF_XMIT_DST_RELEASE: dev_hard_start_xmit() is allowed to
- * release skb->dst
+ * @IFF_XMIT_DST_RELEASE: dev_queue_xmit() is allowed to release skb->dst
* @IFF_DONT_BRIDGE: disallow bridging this ether dev
* @IFF_DISABLE_NETPOLL: disable netpoll at run-time
* @IFF_MACVLAN_PORT: device used as macvlan port
diff --git a/net/core/dev.c b/net/core/dev.c
index e55c546717d4..e178b16b2e53 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2662,11 +2662,6 @@ struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device *dev)
if (skb->next)
return skb;
- /* If device doesn't need skb->dst, release it right now while
- * its hot in this cpu cache
- */
- if (dev->priv_flags & IFF_XMIT_DST_RELEASE)
- skb_dst_drop(skb);
features = netif_skb_features(skb);
skb = validate_xmit_vlan(skb, features);
@@ -2781,8 +2776,6 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
* waiting to be sent out; and the qdisc is not running -
* xmit the skb directly.
*/
- if (!(dev->priv_flags & IFF_XMIT_DST_RELEASE))
- skb_dst_force(skb);
qdisc_bstats_update(q, skb);
@@ -2798,7 +2791,6 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
rc = NET_XMIT_SUCCESS;
} else {
- skb_dst_force(skb);
rc = q->enqueue(skb, q) & NET_XMIT_MASK;
if (qdisc_run_begin(q)) {
if (unlikely(contended)) {
@@ -2895,6 +2887,14 @@ static int __dev_queue_xmit(struct sk_buff *skb, void *accel_priv)
skb_update_prio(skb);
+ /* If device doesn't need skb->dst, release it right now while
+ * it's hot in this cpu cache
+ */
+ if (dev->priv_flags & IFF_XMIT_DST_RELEASE)
+ skb_dst_drop(skb);
+ else
+ skb_dst_force(skb);
+
txq = netdev_pick_tx(dev, skb, accel_priv);
q = rcu_dereference_bh(txq->qdisc);
^ permalink raw reply related
* Re: [PATCH net-next] net: Cleanup skb cloning by adding SKB_FCLONE_FREE
From: Vijay Subramanian @ 2014-10-02 16:36 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev, David Miller, Eric Dumazet
In-Reply-To: <1412249728.16704.76.camel@edumazet-glaptop2.roam.corp.google.com>
Sure. Will resend V2 shortly.
Vijay
On 2 October 2014 04:35, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Wed, 2014-10-01 at 23:33 -0700, Vijay Subramanian wrote:
>> SKB_FCLONE_UNAVAILABLE has overloaded meaning depending on type of skb.
>> 1: If skb is allocated from head_cache, it indicates fclone is not available.
>> 2: If skb is a companion fclone skb (allocated from fclone_cache), it indicates
>> it is available to be used.
>>
>> To avoid confusion for case 2 above, this patch replaces
>> SKB_FCLONE_UNAVAILABLE with SKB_FCLONE_FREE where appropriate. For fclone
>> companion skbs, this indicates it is free for use.
>>
>> SKB_FCLONE_UNAVAILABLE will now simply indicate skb is from head_cache and
>> cannot / will not have a companion fclone.
>>
>> Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com>
>> ---
>> include/linux/skbuff.h | 3 ++-
>> net/core/skbuff.c | 8 ++++----
>> 2 files changed, 6 insertions(+), 5 deletions(-)
>>
>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>> index 7c5036d..6c3fb9a 100644
>> --- a/include/linux/skbuff.h
>> +++ b/include/linux/skbuff.h
>> @@ -339,9 +339,10 @@ struct skb_shared_info {
>>
>>
>> enum {
>> - SKB_FCLONE_UNAVAILABLE,
>> + SKB_FCLONE_UNAVAILABLE, /* skb has no fclone */
>> SKB_FCLONE_ORIG,
>> SKB_FCLONE_CLONE,
>> + SKB_FCLONE_FREE, /* this fclone skb is available */
>> };
>>
> Please comment all the states, now there is no ambiguity ?
>
>
>
^ permalink raw reply
* [PATCH] net: systemport: fix bcm_sysport_insert_tsb()
From: Florian Fainelli @ 2014-10-02 16:43 UTC (permalink / raw)
To: netdev; +Cc: davem, Florian Fainelli
Similar to commit bc23333ba11fb7f959b7e87e121122f5a0fbbca8 ("net:
bcmgenet: fix bcmgenet_put_tx_csum()"), we need to return the skb
pointer in case we had to reallocate the SKB headroom.
Fixes: 80105befdb4b8 ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
drivers/net/ethernet/broadcom/bcmsysport.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c
index 77f1ff7396ac..075688188644 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -857,7 +857,8 @@ static irqreturn_t bcm_sysport_wol_isr(int irq, void *dev_id)
return IRQ_HANDLED;
}
-static int bcm_sysport_insert_tsb(struct sk_buff *skb, struct net_device *dev)
+static struct sk_buff *bcm_sysport_insert_tsb(struct sk_buff *skb,
+ struct net_device *dev)
{
struct sk_buff *nskb;
struct bcm_tsb *tsb;
@@ -873,7 +874,7 @@ static int bcm_sysport_insert_tsb(struct sk_buff *skb, struct net_device *dev)
if (!nskb) {
dev->stats.tx_errors++;
dev->stats.tx_dropped++;
- return -ENOMEM;
+ return NULL;
}
skb = nskb;
}
@@ -892,7 +893,7 @@ static int bcm_sysport_insert_tsb(struct sk_buff *skb, struct net_device *dev)
ip_proto = ipv6_hdr(skb)->nexthdr;
break;
default:
- return 0;
+ return skb;
}
/* Get the checksum offset and the L4 (transport) offset */
@@ -911,7 +912,7 @@ static int bcm_sysport_insert_tsb(struct sk_buff *skb, struct net_device *dev)
tsb->l4_ptr_dest_map = csum_info;
}
- return 0;
+ return skb;
}
static netdev_tx_t bcm_sysport_xmit(struct sk_buff *skb,
@@ -945,8 +946,8 @@ static netdev_tx_t bcm_sysport_xmit(struct sk_buff *skb,
/* Insert TSB and checksum infos */
if (priv->tsb_en) {
- ret = bcm_sysport_insert_tsb(skb, dev);
- if (ret) {
+ skb = bcm_sysport_insert_tsb(skb, dev);
+ if (!skb) {
ret = NETDEV_TX_OK;
goto out;
}
--
1.9.1
^ permalink raw reply related
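The fix above follows a general pattern: a helper that may reallocate its buffer has to hand the (possibly new) pointer back, and the caller must stop using the old one. A minimal userspace sketch of that pattern follows. `struct buf`, `buf_new` and `buf_ensure_headroom` are hypothetical stand-ins for illustration, not the driver's or the skb API.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an sk_buff: payload preceded by headroom. */
struct buf {
	size_t headroom;	/* free bytes in front of the payload */
	size_t len;		/* payload length */
	unsigned char *data;	/* headroom + len bytes */
};

struct buf *buf_new(size_t headroom, const void *payload, size_t len)
{
	struct buf *b = malloc(sizeof(*b));

	if (!b)
		return NULL;
	b->data = malloc(headroom + len);
	if (!b->data) {
		free(b);
		return NULL;
	}
	b->headroom = headroom;
	b->len = len;
	memcpy(b->data + headroom, payload, len);
	return b;
}

/* Returns a buffer with at least 'need' bytes of headroom.
 * If a reallocation is required, the old buffer is always consumed and
 * either a new buffer or NULL is returned, so the caller must use the
 * return value from then on.
 */
struct buf *buf_ensure_headroom(struct buf *b, size_t need)
{
	struct buf *nb;

	if (b->headroom >= need)
		return b;	/* same buffer, nothing changed */

	nb = malloc(sizeof(*nb));
	if (nb) {
		nb->data = malloc(need + b->len);
		if (!nb->data) {
			free(nb);
			nb = NULL;
		}
	}
	if (nb) {
		nb->headroom = need;
		nb->len = b->len;
		memcpy(nb->data + need, b->data + b->headroom, b->len);
	}
	free(b->data);
	free(b);
	return nb;
}
```

A caller would write `b = buf_ensure_headroom(b, 8); if (!b) goto out;`, mirroring the `skb = bcm_sysport_insert_tsb(skb, dev); if (!skb)` shape in the patch.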
* Re: [PATCH] drivers/net/dsa/Kconfig: Let NET_DSA_BCM_SF2 depend on HAS_IOMEM
From: Florian Fainelli @ 2014-10-02 16:44 UTC (permalink / raw)
To: Chen Gang, davem, leitec, andrew
Cc: netdev@vger.kernel.org, Richard Weinberger,
linux-kernel@vger.kernel.org
In-Reply-To: <542D5DAC.4010001@gmail.com>
On 10/02/2014 07:14 AM, Chen Gang wrote:
> NET_DSA_BCM_SF2 needs HAS_IOMEM, so depend on it. The related error (with
> allmodconfig under um):
>
> CC [M] drivers/net/dsa/bcm_sf2.o
> drivers/net/dsa/bcm_sf2.c: In function ‘bcm_sf2_sw_setup’:
> drivers/net/dsa/bcm_sf2.c:487:3: error: implicit declaration of function ‘iounmap’ [-Werror=implicit-function-declaration]
> iounmap(*base);
> ^
>
> Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
> ---
> drivers/net/dsa/Kconfig | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/dsa/Kconfig b/drivers/net/dsa/Kconfig
> index ea0697e..9234d80 100644
> --- a/drivers/net/dsa/Kconfig
> +++ b/drivers/net/dsa/Kconfig
> @@ -47,6 +47,7 @@ config NET_DSA_MV88E6171
>
> config NET_DSA_BCM_SF2
> tristate "Broadcom Starfighter 2 Ethernet switch support"
> + depends on HAS_IOMEM
> select NET_DSA
> select NET_DSA_TAG_BRCM
> select FIXED_PHY if NET_DSA_BCM_SF2=y
>
^ permalink raw reply
* Re: [RFC PATCH linux 2/2] fs/proc: use a hash table for the directory entries
From: Stephen Hemminger @ 2014-10-02 16:46 UTC (permalink / raw)
To: Nicolas Dichtel
Cc: netdev, linux-kernel, davem, ebiederm, akpm, adobriyan, rui.xiang,
viro, oleg, gorcunov, kirill.shutemov, grant.likely, tytso,
Thierry Herbelot
In-Reply-To: <1412263501-6572-3-git-send-email-nicolas.dichtel@6wind.com>
On Thu, 2 Oct 2014 17:25:01 +0200
Nicolas Dichtel <nicolas.dichtel@6wind.com> wrote:
> From: Thierry Herbelot <thierry.herbelot@6wind.com>
>
> The current implementation for the directories in /proc is using a single
> linked list. This is slow when handling directories with large numbers of
> entries (eg netdevice-related entries when lots of tunnels are opened).
>
> This patch enables multiple linked lists. A hash based on the entry name is
> used to select the linked list for one given entry.
>
> Creation of netdevices is faster, as shorter linked lists must be
> scanned when adding a new netdevice.
>
> Here are some numbers:
>
> dummy30000.batch contains 30 000 times 'link add type dummy'.
>
> Before the patch:
> time ip -b dummy30000.batch
> real 2m32.221s
> user 0m0.380s
> sys 2m30.610s
>
> After the patch:
> time ip -b dummy30000.batch
> real 1m57.190s
> user 0m0.350s
> sys 1m56.120s
>
> The single 'subdir' list head is replaced by a subdir hash table. The subdir
> hash buckets are only allocated for directories. The number of hash buckets
> is a compile-time parameter.
>
> For all functions which handle directory entries, an additional check on the
> directory nature of the dir entry ensures that pde_hash_buckets was allocated.
> This check was not needed as subdir was present for all dir entries, whether
> actual directories or simple files.
>
> Signed-off-by: Thierry Herbelot <thierry.herbelot@6wind.com>
> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
I think the speed up is a good idea and makes sense.
It would be better to use the existing hlist macros for the hash table rather than
open coding it.
^ permalink raw reply
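The idea discussed above, replacing one long list with several shorter lists selected by a hash of the entry name, can be sketched in userspace as follows. All names here (`dir_table`, `dir_insert`, `dir_lookup`, `name_hash`, `NBUCKETS`) are hypothetical, and the hash function is an arbitrary choice for illustration, not the one used by the patch. Kernel code would more likely use the hlist helpers mentioned in the reply.

```c
#include <stddef.h>
#include <string.h>

#define NBUCKETS 16	/* illustrative value; a compile-time parameter */

struct entry {
	const char *name;
	struct entry *next;	/* next entry in the same bucket */
};

struct dir_table {
	struct entry *bucket[NBUCKETS];
};

/* Simple multiplicative string hash, reduced to a bucket index. */
static unsigned int name_hash(const char *name)
{
	unsigned int h = 5381;

	while (*name)
		h = h * 33 + (unsigned char)*name++;
	return h % NBUCKETS;
}

/* Only the one bucket selected by the hash is scanned. */
struct entry *dir_lookup(struct dir_table *t, const char *name)
{
	struct entry *e;

	for (e = t->bucket[name_hash(name)]; e; e = e->next)
		if (strcmp(e->name, name) == 0)
			return e;
	return NULL;
}

/* Returns 0 on success, -1 if an entry with that name already exists. */
int dir_insert(struct dir_table *t, struct entry *e)
{
	unsigned int b = name_hash(e->name);

	if (dir_lookup(t, e->name))
		return -1;
	e->next = t->bucket[b];
	t->bucket[b] = e;
	return 0;
}
```

The duplicate check on insert is what makes a single long list slow when many entries are added: each insertion scans the list, so shorter per-bucket lists reduce that cost.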
* Re: [net-next PATCH V6 0/2] qdisc: bulk dequeue support
From: Florian Westphal @ 2014-10-02 16:52 UTC (permalink / raw)
To: Tom Herbert
Cc: Jesper Dangaard Brouer, Linux Netdev List, David S. Miller,
Eric Dumazet, Hannes Frederic Sowa, Florian Westphal,
Daniel Borkmann, Jamal Hadi Salim, Alexander Duyck,
John Fastabend, Dave Taht, Toke Høiland-Jørgensen
In-Reply-To: <CA+mtBx8ecpd+S+E2o8A+Hzp3i2SO7XO0+F_DYA0ng+fjta7G5A@mail.gmail.com>
Tom Herbert <therbert@google.com> wrote:
> On Wed, Oct 1, 2014 at 1:35 PM, Jesper Dangaard Brouer
> <brouer@redhat.com> wrote:
> > This patchset uses DaveM's recent API changes to dev_hard_start_xmit(),
> > from the qdisc layer, to implement dequeue bulking.
> >
> > Patch01: "qdisc: bulk dequeue support for qdiscs with TCQ_F_ONETXQUEUE"
> > - Implement basic qdisc dequeue bulking
> > - This time, 100% relying on BQL limits, no magic safe-guard constants
> >
> > Patch02: "qdisc: dequeue bulking also pickup GSO/TSO packets"
> > - Extend bulking to bulk several GSO/TSO packets
> > - Separate patch, as it introduces a small regression, see test section.
> >
> > We do have a patch03, which exports a userspace tunable as a BQL
> > tunable, that can byte-cap or disable the bulking/bursting. But we
> > could not agree on it internally, thus not sending it now. We
> > basically strive to avoid adding any new userspace tunable.
> >
> Unfortunately we probably still need something. If BQL were disabled
> (by setting BQL min_limit to infinity) then we'll always dequeue all
> the packets in the qdisc. Disabling BQL might be legitimate in
> deployment if say a bug is found in a device that prevents prompt
> transmit completions for some corner case.
Hmm. That's confusing.
So you are saying that to disable BQL one should do
cat limit_max > limit_min ?
But that's not the same as having a BQL-unaware driver.
It seems that the way to get the same behavior as a non-BQL-aware driver (where
dql_avail always returns 0 since num_queued and adj_limit remain at 0)
is to set
echo 0 > limit_max
... which makes dql_avail() return 0, which then also turns off bulk
dequeue.
Confused,
Florian
^ permalink raw reply
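The dql_avail() behavior referred to above can be modeled in a few lines. This is a simplified userspace sketch under the assumption that the available budget is the adjusted limit minus the number of queued bytes. `dql_model` and `dql_model_avail` are hypothetical names, and the real structure carries more state than shown here.

```c
/* Hypothetical, reduced model of the two counters named in the mail. */
struct dql_model {
	unsigned int num_queued;	/* total bytes ever queued */
	unsigned int adj_limit;		/* limit + bytes completed so far */
};

/* Remaining budget; zero or negative means no more bytes should be queued. */
int dql_model_avail(const struct dql_model *dql)
{
	return (int)(dql->adj_limit - dql->num_queued);
}
```

For a driver that never touches the counters, both fields stay at 0 and the function returns 0, which is the non-BQL-aware case described in the mail.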
* [PATCH V2 net-next] net: Cleanup skb cloning by adding SKB_FCLONE_FREE
From: Vijay Subramanian @ 2014-10-02 17:00 UTC (permalink / raw)
To: netdev; +Cc: davem, edumazet, Vijay Subramanian
SKB_FCLONE_UNAVAILABLE has overloaded meaning depending on type of skb.
1: If skb is allocated from head_cache, it indicates fclone is not available.
2: If skb is a companion fclone skb (allocated from fclone_cache), it indicates
it is available to be used.
To avoid confusion for case 2 above, this patch replaces
SKB_FCLONE_UNAVAILABLE with SKB_FCLONE_FREE where appropriate. For fclone
companion skbs, this indicates it is free for use.
SKB_FCLONE_UNAVAILABLE will now simply indicate skb is from head_cache and
cannot / will not have a companion fclone.
Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com>
---
V1-->V2: Comment all states
include/linux/skbuff.h | 7 ++++---
net/core/skbuff.c | 8 ++++----
2 files changed, 8 insertions(+), 7 deletions(-)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 7c5036d..3a5ec76 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -339,9 +339,10 @@ struct skb_shared_info {
enum {
- SKB_FCLONE_UNAVAILABLE,
- SKB_FCLONE_ORIG,
- SKB_FCLONE_CLONE,
+ SKB_FCLONE_UNAVAILABLE, /* skb has no fclone (from head_cache) */
+ SKB_FCLONE_ORIG, /* orig skb (from fclone_cache) */
+ SKB_FCLONE_CLONE, /* companion fclone skb (from fclone_cache) */
+ SKB_FCLONE_FREE, /* this companion fclone skb is available */
};
enum {
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index f77e648..6f4e359 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -265,7 +265,7 @@ struct sk_buff *__alloc_skb(unsigned int size, gfp_t gfp_mask,
skb->fclone = SKB_FCLONE_ORIG;
atomic_set(&fclones->fclone_ref, 1);
- fclones->skb2.fclone = SKB_FCLONE_UNAVAILABLE;
+ fclones->skb2.fclone = SKB_FCLONE_FREE;
fclones->skb2.pfmemalloc = pfmemalloc;
}
out:
@@ -542,7 +542,7 @@ static void kfree_skbmem(struct sk_buff *skb)
fclones = container_of(skb, struct sk_buff_fclones, skb2);
/* Warning : We must perform the atomic_dec_and_test() before
- * setting skb->fclone back to SKB_FCLONE_UNAVAILABLE, otherwise
+ * setting skb->fclone back to SKB_FCLONE_FREE, otherwise
* skb_clone() could set clone_ref to 2 before our decrement.
* Anyway, if we are going to free the structure, no need to
* rewrite skb->fclone.
@@ -553,7 +553,7 @@ static void kfree_skbmem(struct sk_buff *skb)
/* The clone portion is available for
* fast-cloning again.
*/
- skb->fclone = SKB_FCLONE_UNAVAILABLE;
+ skb->fclone = SKB_FCLONE_FREE;
}
break;
}
@@ -874,7 +874,7 @@ struct sk_buff *skb_clone(struct sk_buff *skb, gfp_t gfp_mask)
return NULL;
if (skb->fclone == SKB_FCLONE_ORIG &&
- n->fclone == SKB_FCLONE_UNAVAILABLE) {
+ n->fclone == SKB_FCLONE_FREE) {
n->fclone = SKB_FCLONE_CLONE;
/* As our fastclone was free, clone_ref must be 1 at this point.
* We could use atomic_inc() here, but it is faster
--
1.9.1
^ permalink raw reply related
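The state transitions that the patch above makes explicit can be sketched as a small userspace model. The enum mirrors the four states from the patch, while `skb_pair`, `pair_init`, `pair_fast_clone` and `pair_release_clone` are hypothetical names. The model deliberately leaves out the reference counting that the real code performs before a companion is marked free again.

```c
#include <stdbool.h>

enum fclone_state {
	FCLONE_UNAVAILABLE,	/* skb has no fclone (from head_cache) */
	FCLONE_ORIG,		/* orig skb (from fclone_cache) */
	FCLONE_CLONE,		/* companion fclone skb, in use */
	FCLONE_FREE,		/* companion fclone skb, available */
};

/* Hypothetical pair: an original skb and its companion clone slot. */
struct skb_pair {
	enum fclone_state orig;
	enum fclone_state companion;
};

/* Allocation from the fclone cache: companion starts out free. */
void pair_init(struct skb_pair *p)
{
	p->orig = FCLONE_ORIG;
	p->companion = FCLONE_FREE;
}

/* Try to take the companion slot; false means a regular clone is needed. */
bool pair_fast_clone(struct skb_pair *p)
{
	if (p->orig == FCLONE_ORIG && p->companion == FCLONE_FREE) {
		p->companion = FCLONE_CLONE;
		return true;
	}
	return false;
}

/* Freeing the clone while the original lives makes the slot free again. */
void pair_release_clone(struct skb_pair *p)
{
	if (p->companion == FCLONE_CLONE)
		p->companion = FCLONE_FREE;
}
```

With a separate FCLONE_FREE state, the check in pair_fast_clone() no longer depends on a value that also means "this skb has no companion at all".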
* [PATCH 0/2 NEXT] Some late fixes for rtlwifi
From: Larry Finger @ 2014-10-02 17:00 UTC (permalink / raw)
To: linville; +Cc: linux-wireless, troy_tan, Larry Finger, netdev
These patches fix a Kconfig error for RTL8192EE, and some static checker warnings
reported by Dan Carpenter.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
---
Larry Finger (2):
rtlwifi: Fix Kconfig for RTL8192EE
rtlwifi: Fix static checker warnings for various drivers
drivers/net/wireless/rtlwifi/Kconfig | 3 ++-
drivers/net/wireless/rtlwifi/rtl8188ee/trx.c | 7 -------
drivers/net/wireless/rtlwifi/rtl8192ce/trx.c | 4 ----
drivers/net/wireless/rtlwifi/rtl8192ee/hw.c | 24 ++++++++++++------------
drivers/net/wireless/rtlwifi/rtl8192ee/trx.c | 7 -------
drivers/net/wireless/rtlwifi/rtl8192se/trx.c | 4 ----
drivers/net/wireless/rtlwifi/rtl8723ae/trx.c | 6 ------
drivers/net/wireless/rtlwifi/rtl8723be/trx.c | 7 -------
drivers/net/wireless/rtlwifi/rtl8821ae/trx.c | 7 -------
9 files changed, 14 insertions(+), 55 deletions(-)
--
1.8.4.5
^ permalink raw reply
* [PATCH 1/2 NEXT] rtlwifi: Fix Kconfig for RTL8192EE
From: Larry Finger @ 2014-10-02 17:00 UTC (permalink / raw)
To: linville; +Cc: linux-wireless, troy_tan, Larry Finger, netdev
In-Reply-To: <1412269254-19983-1-git-send-email-Larry.Finger@lwfinger.net>
The driver needs btcoexist, but Kconfig fails to select it. This omission
could cause build errors for some configurations.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
---
drivers/net/wireless/rtlwifi/Kconfig | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/wireless/rtlwifi/Kconfig b/drivers/net/wireless/rtlwifi/Kconfig
index 552742e..5cf509d 100644
--- a/drivers/net/wireless/rtlwifi/Kconfig
+++ b/drivers/net/wireless/rtlwifi/Kconfig
@@ -86,6 +86,7 @@ config RTL8192EE
depends on PCI
select RTLWIFI
select RTLWIFI_PCI
+ select RTLBTCOEXIST
---help---
This is the driver for Realtek RTL8192EE 802.11n PCIe
wireless network adapters.
@@ -147,7 +148,7 @@ config RTL8723_COMMON
config RTLBTCOEXIST
tristate
- depends on RTL8723AE || RTL8723BE || RTL8821AE
+ depends on RTL8723AE || RTL8723BE || RTL8821AE || RTL8192EE
default y
endif
--
1.8.4.5
^ permalink raw reply related
* [PATCH 2/2 NEXT] rtlwifi: Fix static checker warnings for various drivers
From: Larry Finger @ 2014-10-02 17:00 UTC (permalink / raw)
To: linville; +Cc: linux-wireless, troy_tan, Larry Finger, netdev, Dan Carpenter
In-Reply-To: <1412269254-19983-1-git-send-email-Larry.Finger@lwfinger.net>
Indenting errors yielded the following static checker warnings:
drivers/net/wireless/rtlwifi/rtl8192ee/hw.c:533 rtl92ee_set_hw_reg() warn: add curly braces? (if)
drivers/net/wireless/rtlwifi/rtl8192ee/hw.c:539 rtl92ee_set_hw_reg() warn: add curly braces? (if)
An unreleased version of the static checker also reported:
drivers/net/wireless/rtlwifi/rtl8723be/trx.c:550 rtl8723be_rx_query_desc() warn: 'hdr' can't be NULL.
drivers/net/wireless/rtlwifi/rtl8188ee/trx.c:621 rtl88ee_rx_query_desc() warn: 'hdr' can't be NULL.
drivers/net/wireless/rtlwifi/rtl8192ee/trx.c:567 rtl92ee_rx_query_desc() warn: 'hdr' can't be NULL.
drivers/net/wireless/rtlwifi/rtl8821ae/trx.c:758 rtl8821ae_rx_query_desc() warn: 'hdr' can't be NULL.
drivers/net/wireless/rtlwifi/rtl8723ae/trx.c:494 rtl8723e_rx_query_desc() warn: 'hdr' can't be NULL.
drivers/net/wireless/rtlwifi/rtl8192se/trx.c:315 rtl92se_rx_query_desc() warn: 'hdr' can't be NULL.
drivers/net/wireless/rtlwifi/rtl8192ce/trx.c:392 rtl92ce_rx_query_desc() warn: 'hdr' can't be NULL.
All of these are fixed.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
---
drivers/net/wireless/rtlwifi/rtl8188ee/trx.c | 7 -------
drivers/net/wireless/rtlwifi/rtl8192ce/trx.c | 4 ----
drivers/net/wireless/rtlwifi/rtl8192ee/hw.c | 24 ++++++++++++------------
drivers/net/wireless/rtlwifi/rtl8192ee/trx.c | 7 -------
drivers/net/wireless/rtlwifi/rtl8192se/trx.c | 4 ----
drivers/net/wireless/rtlwifi/rtl8723ae/trx.c | 6 ------
drivers/net/wireless/rtlwifi/rtl8723be/trx.c | 7 -------
drivers/net/wireless/rtlwifi/rtl8821ae/trx.c | 7 -------
8 files changed, 12 insertions(+), 54 deletions(-)
diff --git a/drivers/net/wireless/rtlwifi/rtl8188ee/trx.c b/drivers/net/wireless/rtlwifi/rtl8188ee/trx.c
index cf56ec8..df549c9 100644
--- a/drivers/net/wireless/rtlwifi/rtl8188ee/trx.c
+++ b/drivers/net/wireless/rtlwifi/rtl8188ee/trx.c
@@ -618,13 +618,6 @@ bool rtl88ee_rx_query_desc(struct ieee80211_hw *hw,
* to decrypt it
*/
if (status->decrypted) {
- if (!hdr) {
- WARN_ON_ONCE(true);
- pr_err("decrypted is true but hdr NULL, from skb %p\n",
- rtl_get_hdr(skb));
- return false;
- }
-
if ((!_ieee80211_is_robust_mgmt_frame(hdr)) &&
(ieee80211_has_protected(hdr->frame_control)))
rx_status->flag |= RX_FLAG_DECRYPTED;
diff --git a/drivers/net/wireless/rtlwifi/rtl8192ce/trx.c b/drivers/net/wireless/rtlwifi/rtl8192ce/trx.c
index c140123..2fb9c7a 100644
--- a/drivers/net/wireless/rtlwifi/rtl8192ce/trx.c
+++ b/drivers/net/wireless/rtlwifi/rtl8192ce/trx.c
@@ -389,10 +389,6 @@ bool rtl92ce_rx_query_desc(struct ieee80211_hw *hw,
* to decrypt it
*/
if (stats->decrypted) {
- if (!hdr) {
- /* In testing, hdr was NULL here */
- return false;
- }
if ((_ieee80211_is_robust_mgmt_frame(hdr)) &&
(ieee80211_has_protected(hdr->frame_control)))
rx_status->flag &= ~RX_FLAG_DECRYPTED;
diff --git a/drivers/net/wireless/rtlwifi/rtl8192ee/hw.c b/drivers/net/wireless/rtlwifi/rtl8192ee/hw.c
index 85d0d58..dfdc9b2 100644
--- a/drivers/net/wireless/rtlwifi/rtl8192ee/hw.c
+++ b/drivers/net/wireless/rtlwifi/rtl8192ee/hw.c
@@ -530,18 +530,18 @@ void rtl92ee_set_hw_reg(struct ieee80211_hw *hw, u8 variable, u8 *val)
fac = (1 << (fac + 2));
if (fac > 0xf)
fac = 0xf;
- for (i = 0; i < 4; i++) {
- if ((reg[i] & 0xf0) > (fac << 4))
- reg[i] = (reg[i] & 0x0f) |
- (fac << 4);
- if ((reg[i] & 0x0f) > fac)
- reg[i] = (reg[i] & 0xf0) | fac;
- rtl_write_byte(rtlpriv,
- (REG_AGGLEN_LMT + i),
- reg[i]);
- }
- RT_TRACE(rtlpriv, COMP_MLME, DBG_LOUD,
- "Set HW_VAR_AMPDU_FACTOR:%#x\n", fac);
+ for (i = 0; i < 4; i++) {
+ if ((reg[i] & 0xf0) > (fac << 4))
+ reg[i] = (reg[i] & 0x0f) |
+ (fac << 4);
+ if ((reg[i] & 0x0f) > fac)
+ reg[i] = (reg[i] & 0xf0) | fac;
+ rtl_write_byte(rtlpriv,
+ (REG_AGGLEN_LMT + i),
+ reg[i]);
+ }
+ RT_TRACE(rtlpriv, COMP_MLME, DBG_LOUD,
+ "Set HW_VAR_AMPDU_FACTOR:%#x\n", fac);
}
}
break;
diff --git a/drivers/net/wireless/rtlwifi/rtl8192ee/trx.c b/drivers/net/wireless/rtlwifi/rtl8192ee/trx.c
index 83edd95..2fcbef1 100644
--- a/drivers/net/wireless/rtlwifi/rtl8192ee/trx.c
+++ b/drivers/net/wireless/rtlwifi/rtl8192ee/trx.c
@@ -564,13 +564,6 @@ bool rtl92ee_rx_query_desc(struct ieee80211_hw *hw,
* to decrypt it
*/
if (status->decrypted) {
- if (!hdr) {
- WARN_ON_ONCE(true);
- pr_err("decrypted is true but hdr NULL, from skb %p\n",
- rtl_get_hdr(skb));
- return false;
- }
-
if ((!_ieee80211_is_robust_mgmt_frame(hdr)) &&
(ieee80211_has_protected(hdr->frame_control)))
rx_status->flag |= RX_FLAG_DECRYPTED;
diff --git a/drivers/net/wireless/rtlwifi/rtl8192se/trx.c b/drivers/net/wireless/rtlwifi/rtl8192se/trx.c
index 2b3c78b..b358ebc 100644
--- a/drivers/net/wireless/rtlwifi/rtl8192se/trx.c
+++ b/drivers/net/wireless/rtlwifi/rtl8192se/trx.c
@@ -312,10 +312,6 @@ bool rtl92se_rx_query_desc(struct ieee80211_hw *hw, struct rtl_stats *stats,
hdr = (struct ieee80211_hdr *)(skb->data +
stats->rx_drvinfo_size + stats->rx_bufshift);
- if (!hdr) {
- /* during testing, hdr was NULL here */
- return false;
- }
if ((_ieee80211_is_robust_mgmt_frame(hdr)) &&
(ieee80211_has_protected(hdr->frame_control)))
rx_status->flag &= ~RX_FLAG_DECRYPTED;
diff --git a/drivers/net/wireless/rtlwifi/rtl8723ae/trx.c b/drivers/net/wireless/rtlwifi/rtl8723ae/trx.c
index 1da2367..d372cca 100644
--- a/drivers/net/wireless/rtlwifi/rtl8723ae/trx.c
+++ b/drivers/net/wireless/rtlwifi/rtl8723ae/trx.c
@@ -491,12 +491,6 @@ bool rtl8723e_rx_query_desc(struct ieee80211_hw *hw,
* to decrypt it
*/
if (status->decrypted) {
- if (!hdr) {
- WARN_ON_ONCE(true);
- pr_err("decrypted is true but hdr NULL, from skb %p\n",
- rtl_get_hdr(skb));
- return false;
- }
if ((!_ieee80211_is_robust_mgmt_frame(hdr)) &&
(ieee80211_has_protected(hdr->frame_control)))
rx_status->flag |= RX_FLAG_DECRYPTED;
diff --git a/drivers/net/wireless/rtlwifi/rtl8723be/trx.c b/drivers/net/wireless/rtlwifi/rtl8723be/trx.c
index 9679cd2..d6a1c70 100644
--- a/drivers/net/wireless/rtlwifi/rtl8723be/trx.c
+++ b/drivers/net/wireless/rtlwifi/rtl8723be/trx.c
@@ -547,13 +547,6 @@ bool rtl8723be_rx_query_desc(struct ieee80211_hw *hw,
* to decrypt it
*/
if (status->decrypted) {
- if (!hdr) {
- WARN_ON_ONCE(true);
- pr_err("decrypted is true but hdr NULL, from skb %p\n",
- rtl_get_hdr(skb));
- return false;
- }
-
if ((!_ieee80211_is_robust_mgmt_frame(hdr)) &&
(ieee80211_has_protected(hdr->frame_control)))
rx_status->flag |= RX_FLAG_DECRYPTED;
diff --git a/drivers/net/wireless/rtlwifi/rtl8821ae/trx.c b/drivers/net/wireless/rtlwifi/rtl8821ae/trx.c
index 7ece0ef..383b86b 100644
--- a/drivers/net/wireless/rtlwifi/rtl8821ae/trx.c
+++ b/drivers/net/wireless/rtlwifi/rtl8821ae/trx.c
@@ -755,13 +755,6 @@ bool rtl8821ae_rx_query_desc(struct ieee80211_hw *hw,
* to decrypt it
*/
if (status->decrypted) {
- if (!hdr) {
- WARN_ON_ONCE(true);
- pr_err("decrypted is true but hdr NULL, from skb %p\n",
- rtl_get_hdr(skb));
- return false;
- }
-
if ((!_ieee80211_is_robust_mgmt_frame(hdr)) &&
(ieee80211_has_protected(hdr->frame_control)))
rx_status->flag |= RX_FLAG_DECRYPTED;
--
1.8.4.5
^ permalink raw reply related
* Re: [PATCH net-next v6 2/2] bonding: Simplify the xmit function for modes that use xmit_hash
From: Mahesh Bandewar @ 2014-10-02 17:19 UTC (permalink / raw)
To: Cong Wang
Cc: Jay Vosburgh, Veaceslav Falico, Andy Gospodarek, David Miller,
netdev, Eric Dumazet, Maciej Zenczykowski
In-Reply-To: <CAHA+R7PiP2Ce-+3S6Uy5kkKjk1sN6uE2gXzvL_HtfnyVHLoKoQ@mail.gmail.com>
On Thu, Oct 2, 2014 at 10:10 AM, Cong Wang <cwang@twopensource.com> wrote:
> On Wed, Oct 1, 2014 at 1:38 AM, Mahesh Bandewar <maheshb@google.com> wrote:
>> +int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
>> +{
>> + struct slave *slave;
>> + struct list_head *iter;
>> + struct bond_up_slave *new_arr, *old_arr;
>> + int slaves_in_agg;
>> + int agg_id = 0;
>> + int ret = 0;
>> +
>> +#ifdef CONFIG_LOCKDEP
>> + WARN_ON(lockdep_is_held(&bond->mode_lock));
>> +#endif
>
>
> I think you can use lockdep_is_held().
>
>> +
>> + new_arr = kzalloc(offsetof(struct bond_up_slave, arr[bond->slave_cnt]),
>> + GFP_KERNEL);
>> + if (!new_arr) {
>> + ret = -ENOMEM;
>> + pr_err("Failed to build slave-array.\n");
>> + goto out;
>> + }
>
>
> No need to print an error message for OOM, it is already noisy. :)
>
Agreed, the OOM condition is already noisy, but failing silently would
not help debugging, hence the added message.
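
For reference, the allocation pattern in the quoted hunk (a header plus a
variable-length tail, zeroed, with an error path on failure) can be
sketched in userspace roughly as follows. This is only an illustrative
model: struct model_up_slave, struct model_slave and
model_alloc_slave_arr() are hypothetical names, calloc() stands in for
kzalloc(), and the size is computed as offsetof(..., arr) plus n elements
rather than offsetof(..., arr[n]) so that it stays within standard C.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-slave object; left opaque here. */
struct model_slave;

/* Hypothetical model of an array header followed by a flexible tail. */
struct model_up_slave {
	unsigned int count;          /* number of valid entries in arr[] */
	struct model_slave *arr[];   /* flexible array member */
};

/*
 * Allocate a zeroed header plus room for slave_cnt pointers.
 * Returns NULL on allocation failure; the caller decides whether
 * to log anything in that case.
 */
static struct model_up_slave *model_alloc_slave_arr(unsigned int slave_cnt)
{
	size_t size = offsetof(struct model_up_slave, arr) +
		      (size_t)slave_cnt * sizeof(struct model_slave *);

	return calloc(1, size);
}
```

The caller owns the returned block and releases it with free(); in the
patch itself the release is deferred via kfree_rcu().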
>
>> + if (BOND_MODE(bond) == BOND_MODE_8023AD) {
>> + struct ad_info ad_info;
>> +
>> + if (bond_3ad_get_active_agg_info(bond, &ad_info)) {
>> + pr_debug("bond_3ad_get_active_agg_info failed\n");
>
>
> I suspect how useful this debug info is since your patch is almost ready
> to merge.
>
It could be useful for someone else too :)
>> + kfree_rcu(new_arr, rcu);
>> + /* No active aggragator means its not safe to use
>
> s/its/it's/
>
thanks
>> + * the previous array.
>> + */
>> + old_arr = rtnl_dereference(bond->slave_arr);
>> + if (old_arr) {
>> + RCU_INIT_POINTER(bond->slave_arr, NULL);
>> + kfree_rcu(old_arr, rcu);
>> + }
>> + goto out;
>> + }
>> + slaves_in_agg = ad_info.ports;
>> + agg_id = ad_info.aggregator_id;
>> + }
>
>
>
> Thanks.
^ permalink raw reply
* Re: [RFC PATCH linux 2/2] fs/proc: use a hash table for the directory entries
From: Alexey Dobriyan @ 2014-10-02 17:28 UTC (permalink / raw)
To: Nicolas Dichtel
Cc: netdev, linux-kernel, davem, ebiederm, akpm, rui.xiang, viro,
oleg, gorcunov, kirill.shutemov, grant.likely, tytso,
Thierry Herbelot
In-Reply-To: <1412263501-6572-3-git-send-email-nicolas.dichtel@6wind.com>
On Thu, Oct 02, 2014 at 05:25:01PM +0200, Nicolas Dichtel wrote:
> +static inline unsigned int proc_pde_name_hash(const unsigned char *name,
> + const unsigned int len)
> +{
> + return full_name_hash(name, len) & PROC_PDE_HASH_MASK;
> +}
PDE already stands for "proc dir entry" :^)
Alexey
^ permalink raw reply
* Re: [PATCH net-next v6 2/2] bonding: Simplify the xmit function for modes that use xmit_hash
From: Mahesh Bandewar @ 2014-10-02 17:28 UTC (permalink / raw)
To: David Laight
Cc: Jay Vosburgh, Veaceslav Falico, Andy Gospodarek, David Miller,
netdev, Eric Dumazet, Maciej Zenczykowski
In-Reply-To: <063D6719AE5E284EB5DD2968C1650D6D174A2CF5@AcuExch.aculab.com>
On Thu, Oct 2, 2014 at 2:42 PM, David Laight <David.Laight@aculab.com> wrote:
> From: Mahesh Bandewar
>> On Wed, Oct 1, 2014 at 9:49 PM, Jay Vosburgh <jay.vosburgh@canonical.com> wrote:
>> > Mahesh Bandewar <maheshb@google.com> wrote:
> ...
>> >> * Select aggregation groups, and assign each port for it's aggregetor. The
>> >> * selection logic is called in the inititalization (after all the handshkes),
>> >> * and after every lacpdu receive (if selected is off).
>> >> */
>> >>-static void ad_port_selection_logic(struct port *port)
>> >>+static void ad_port_selection_logic(struct port *port, bool *update_slave_arr)
>> >
>> > Since this function is void, why not have it return a value
>> > instead of the bool *update_slave_arr? That would eliminate the need
>> > for some call sites to pass a "dummy" to the function. This comment
>> > applies to ad_agg_selection_logic and ad_enable_collecting_distributing
>> > as well.
>> >
>> Yes, I had similar discussion with Nik earlier and overloading the
>> return value did not feel clean and future-proof and hence decided to
>> take this approach.
>
> What overload?
> Returning values by reference parameters isn't really a good idea.
> It kills performance and optimisations.
> If you ever need a second return value then solve the problem then.
>
Please show me how much performance we are losing by taking this
approach... otherwise this argument is bogus!
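
To make the two calling conventions under discussion concrete, here is a
minimal userspace sketch. It makes no claim about which one performs
better. struct model_port and both function names are hypothetical and
are not taken from the bonding driver.

```c
#include <stdbool.h>

/* Hypothetical, stripped-down port state for illustration only. */
struct model_port {
	bool selected;       /* current selection state */
	bool was_selected;   /* state seen on the previous call */
};

/* Style used in the patch: result reported through a pointer argument. */
static void model_select_outparam(struct model_port *port,
				  bool *update_slave_arr)
{
	bool changed = port->selected != port->was_selected;

	port->was_selected = port->selected;
	if (changed)
		*update_slave_arr = true;   /* only ever set, never cleared */
}

/* Style suggested in review: result reported as the return value. */
static bool model_select_ret(struct model_port *port)
{
	bool changed = port->selected != port->was_selected;

	port->was_selected = port->selected;
	return changed;
}
```

With the out-parameter form, a caller that does not care about the
result still has to pass a dummy bool; with the return form it can
simply ignore the value.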
> David
>
>
>
^ permalink raw reply
* Re: [net-next PATCH V6 0/2] qdisc: bulk dequeue support
From: Eric Dumazet @ 2014-10-02 17:32 UTC (permalink / raw)
To: Florian Westphal
Cc: Tom Herbert, Jesper Dangaard Brouer, Linux Netdev List,
David S. Miller, Hannes Frederic Sowa, Daniel Borkmann,
Jamal Hadi Salim, Alexander Duyck, John Fastabend, Dave Taht,
Toke Høiland-Jørgensen
In-Reply-To: <20141002165215.GJ1803@breakpoint.cc>
On Thu, 2014-10-02 at 18:52 +0200, Florian Westphal wrote:
> Tom Herbert <therbert@google.com> wrote:
> > On Wed, Oct 1, 2014 at 1:35 PM, Jesper Dangaard Brouer
> > <brouer@redhat.com> wrote:
> > > This patchset uses DaveM's recent API changes to dev_hard_start_xmit(),
> > > from the qdisc layer, to implement dequeue bulking.
> > >
> > > Patch01: "qdisc: bulk dequeue support for qdiscs with TCQ_F_ONETXQUEUE"
> > > - Implement basic qdisc dequeue bulking
> > > - This time, 100% relying on BQL limits, no magic safe-guard constants
> > >
> > > Patch02: "qdisc: dequeue bulking also pickup GSO/TSO packets"
> > > - Extend bulking to bulk several GSO/TSO packets
> > > - Seperate patch, as it introduce a small regression, see test section.
> > >
> > > We do have a patch03, which exports a userspace tunable as a BQL
> > > tunable, that can byte-cap or disable the bulking/bursting. But we
> > > could not agree on it internally, thus not sending it now. We
> > > basically strive to avoid adding any new userspace tunable.
> > >
> > Unfortunately we probably still need something. If BQL were disabled
> > (by setting BQL min_limit to infinity) then we'll always dequeue all
> > the packets in the qdisc. Disabling BQL might be legitimate in
> > deployment if say a bug is found in a device that prevents prompt
> > transmit completions for some corner case.
>
> Hmm. Thats confusing.
>
> So you are saying to disable bql one should do
>
> cat limit_max > limit_min ?
>
> But thats not the same as having a bql-unaware driver.
>
> Seems to get same behavior as non-bql aware driver (where
> dql_avail always returns 0 since num_queued and adj_limit remain at 0)
> is to set
>
> echo 0 > limit_max
>
> ... which makes dql_avail() return 0, which then also turns off bulk
> dequeue.
So Tom's point is that we might have a BQL-enabled driver, but for some
reason the admin wants to set limit_min to 10000000.
The admin then also wants to restrict the xmit_more batches to 13
packets.
Then a debugging facility might be nice to have.
What about adding two attributes:
/sys/class/net/ethX/queues/tx-Y/xmit_more_inflight (live number of
deferred frames on this TX queue, waiting for a doorbell to kick the NIC)
/sys/class/net/ethX/queues/tx-Y/xmit_more_limit (default to 8)
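
A rough userspace model of how such a packet cap could interact with the
byte budget is sketched below. It assumes, as a simplification of the
patchset, that the first packet is always sent and that further packets
are pulled only while byte budget remains. model_bulk_dequeue() and its
parameters are hypothetical; avail_bytes plays the role of the value
dql_avail() would report, and pkt_limit the role of the proposed
xmit_more_limit.

```c
#include <stddef.h>

/*
 * Hypothetical model, not kernel code.
 * Returns how many of the n_pkts queued packets go out in one batch.
 */
static size_t model_bulk_dequeue(int avail_bytes,
				 const unsigned int *pkt_len,
				 size_t n_pkts, size_t pkt_limit)
{
	size_t taken;
	int budget;

	if (n_pkts == 0)
		return 0;

	/* The head packet is always dequeued, whatever the budget is. */
	budget = avail_bytes - (int)pkt_len[0];
	taken = 1;

	/* Extra packets need both remaining bytes and a free batch slot. */
	while (budget > 0 && taken < n_pkts && taken < pkt_limit) {
		budget -= (int)pkt_len[taken];
		taken++;
	}
	return taken;
}
```

In this model a zero byte budget degenerates to one packet per call,
while a very large limit_min leaves the packet cap as the only bound on
the batch size.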
^ permalink raw reply
* Re: [net-next PATCH V6 0/2] qdisc: bulk dequeue support
From: Tom Herbert @ 2014-10-02 17:35 UTC (permalink / raw)
To: Florian Westphal
Cc: Jesper Dangaard Brouer, Linux Netdev List, David S. Miller,
Eric Dumazet, Hannes Frederic Sowa, Daniel Borkmann,
Jamal Hadi Salim, Alexander Duyck, John Fastabend, Dave Taht,
Toke Høiland-Jørgensen
In-Reply-To: <20141002165215.GJ1803@breakpoint.cc>
On Thu, Oct 2, 2014 at 9:52 AM, Florian Westphal <fw@strlen.de> wrote:
> Tom Herbert <therbert@google.com> wrote:
>> On Wed, Oct 1, 2014 at 1:35 PM, Jesper Dangaard Brouer
>> <brouer@redhat.com> wrote:
>> > This patchset uses DaveM's recent API changes to dev_hard_start_xmit(),
>> > from the qdisc layer, to implement dequeue bulking.
>> >
>> > Patch01: "qdisc: bulk dequeue support for qdiscs with TCQ_F_ONETXQUEUE"
>> > - Implement basic qdisc dequeue bulking
>> > - This time, 100% relying on BQL limits, no magic safe-guard constants
>> >
>> > Patch02: "qdisc: dequeue bulking also pickup GSO/TSO packets"
>> > - Extend bulking to bulk several GSO/TSO packets
>> > - Seperate patch, as it introduce a small regression, see test section.
>> >
>> > We do have a patch03, which exports a userspace tunable as a BQL
>> > tunable, that can byte-cap or disable the bulking/bursting. But we
>> > could not agree on it internally, thus not sending it now. We
>> > basically strive to avoid adding any new userspace tunable.
>> >
>> Unfortunately we probably still need something. If BQL were disabled
>> (by setting BQL min_limit to infinity) then we'll always dequeue all
>> the packets in the qdisc. Disabling BQL might be legitimate in
>> deployment if say a bug is found in a device that prevents prompt
>> transmit completions for some corner case.
>
> Hmm. Thats confusing.
>
> So you are saying to disable bql one should do
>
> cat limit_max > limit_min ?
>
echo max > limit_min
echo max > limit_max
"Disables" BQL by forcing the limit to be really big.
> But thats not the same as having a bql-unaware driver.
>
Yes, that's true. This does not disable the accounting and limit check
which is where a bug would manifest itself.
> Seems to get same behavior as non-bql aware driver (where
> dql_avail always returns 0 since num_queued and adj_limit remain at 0)
> is to set
>
That doesn't work for BQL: dql_avail returning zero means we can
queue only one packet.
> echo 0 > limit_max
>
> ... which makes dql_avail() return 0, which then also turns off bulk
> dequeue.
>
> Confused,
> Florian
I withdraw my comment.
^ permalink raw reply