* [PATCH AUTOSEL 4.19 013/123] tools: bpftool: prevent infinite loop in get_fdinfo()
[not found] <20181205093555.5386-1-sashal@kernel.org>
@ 2018-12-05 9:34 ` Sasha Levin
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 017/123] netfilter: nf_conncount: use spin_lock_bh instead of spin_lock Sasha Levin
` (24 subsequent siblings)
25 siblings, 0 replies; 28+ messages in thread
From: Sasha Levin @ 2018-12-05 9:34 UTC (permalink / raw)
To: stable, linux-kernel; +Cc: Quentin Monnet, Daniel Borkmann, Sasha Levin, netdev
From: Quentin Monnet <quentin.monnet@netronome.com>
[ Upstream commit 53909030aa29bffe1f8490df62176c2375135652 ]
Function getline() returns -1 on failure to read a line, thus creating
an infinite loop in get_fdinfo() if the key is not found. Fix it by
calling the function only as long as we get a strictly positive return
value.
Found by copying the code for a key which is not always present...
Fixes: 71bb428fe2c1 ("tools: bpf: add bpftool")
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
tools/bpf/bpftool/common.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/bpf/bpftool/common.c b/tools/bpf/bpftool/common.c
index b3a0709ea7ed..fcaf00621102 100644
--- a/tools/bpf/bpftool/common.c
+++ b/tools/bpf/bpftool/common.c
@@ -304,7 +304,7 @@ char *get_fdinfo(int fd, const char *key)
return NULL;
}
- while ((n = getline(&line, &line_n, fdi))) {
+ while ((n = getline(&line, &line_n, fdi)) > 0) {
char *value;
int len;
--
2.17.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH AUTOSEL 4.19 017/123] netfilter: nf_conncount: use spin_lock_bh instead of spin_lock
[not found] <20181205093555.5386-1-sashal@kernel.org>
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 013/123] tools: bpftool: prevent infinite loop in get_fdinfo() Sasha Levin
@ 2018-12-05 9:34 ` Sasha Levin
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 018/123] netfilter: nf_conncount: fix list_del corruption in conn_free Sasha Levin
` (23 subsequent siblings)
25 siblings, 0 replies; 28+ messages in thread
From: Sasha Levin @ 2018-12-05 9:34 UTC (permalink / raw)
To: stable, linux-kernel
Cc: Taehee Yoo, Pablo Neira Ayuso, Sasha Levin, netfilter-devel,
coreteam, netdev
From: Taehee Yoo <ap420073@gmail.com>
[ Upstream commit fd3e71a9f71e232181a225301a75936373636ccc ]
conn_free() holds lock with spin_lock() and it is called by both
nf_conncount_lookup() and nf_conncount_gc_list(). nf_conncount_lookup()
is called from bottom-half context and nf_conncount_gc_list() from
process context. So that spin_lock() call is not safe. Hence
conn_free() should use spin_lock_bh() instead of spin_lock().
test commands:
%nft add table ip filter
%nft add chain ip filter input { type filter hook input priority 0\; }
%nft add rule filter input meter test { ip saddr ct count over 2 } \
counter
splat looks like:
[ 461.996507] ================================
[ 461.998999] WARNING: inconsistent lock state
[ 461.998999] 4.19.0-rc6+ #22 Not tainted
[ 461.998999] --------------------------------
[ 461.998999] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
[ 461.998999] kworker/0:2/134 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 461.998999] 00000000a71a559a (&(&list->list_lock)->rlock){+.?.}, at: conn_free+0x69/0x2b0 [nf_conncount]
[ 461.998999] {IN-SOFTIRQ-W} state was registered at:
[ 461.998999] _raw_spin_lock+0x30/0x70
[ 461.998999] nf_conncount_add+0x28a/0x520 [nf_conncount]
[ 461.998999] nft_connlimit_eval+0x401/0x580 [nft_connlimit]
[ 461.998999] nft_dynset_eval+0x32b/0x590 [nf_tables]
[ 461.998999] nft_do_chain+0x497/0x1430 [nf_tables]
[ 461.998999] nft_do_chain_ipv4+0x255/0x330 [nf_tables]
[ 461.998999] nf_hook_slow+0xb1/0x160
[ ... ]
[ 461.998999] other info that might help us debug this:
[ 461.998999] Possible unsafe locking scenario:
[ 461.998999]
[ 461.998999] CPU0
[ 461.998999] ----
[ 461.998999] lock(&(&list->list_lock)->rlock);
[ 461.998999] <Interrupt>
[ 461.998999] lock(&(&list->list_lock)->rlock);
[ 461.998999]
[ 461.998999] *** DEADLOCK ***
[ 461.998999]
[ ... ]
Fixes: 5c789e131cbb ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/netfilter/nf_conncount.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c
index 02ca7df793f5..71b1f4f99580 100644
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -106,15 +106,15 @@ nf_conncount_add(struct nf_conncount_list *list,
conn->zone = *zone;
conn->cpu = raw_smp_processor_id();
conn->jiffies32 = (u32)jiffies;
- spin_lock(&list->list_lock);
+ spin_lock_bh(&list->list_lock);
if (list->dead == true) {
kmem_cache_free(conncount_conn_cachep, conn);
- spin_unlock(&list->list_lock);
+ spin_unlock_bh(&list->list_lock);
return NF_CONNCOUNT_SKIP;
}
list_add_tail(&conn->node, &list->head);
list->count++;
- spin_unlock(&list->list_lock);
+ spin_unlock_bh(&list->list_lock);
return NF_CONNCOUNT_ADDED;
}
EXPORT_SYMBOL_GPL(nf_conncount_add);
@@ -132,10 +132,10 @@ static bool conn_free(struct nf_conncount_list *list,
{
bool free_entry = false;
- spin_lock(&list->list_lock);
+ spin_lock_bh(&list->list_lock);
if (list->count == 0) {
- spin_unlock(&list->list_lock);
+ spin_unlock_bh(&list->list_lock);
return free_entry;
}
@@ -144,7 +144,7 @@ static bool conn_free(struct nf_conncount_list *list,
if (list->count == 0)
free_entry = true;
- spin_unlock(&list->list_lock);
+ spin_unlock_bh(&list->list_lock);
call_rcu(&conn->rcu_head, __conn_free);
return free_entry;
}
--
2.17.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH AUTOSEL 4.19 018/123] netfilter: nf_conncount: fix list_del corruption in conn_free
[not found] <20181205093555.5386-1-sashal@kernel.org>
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 013/123] tools: bpftool: prevent infinite loop in get_fdinfo() Sasha Levin
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 017/123] netfilter: nf_conncount: use spin_lock_bh instead of spin_lock Sasha Levin
@ 2018-12-05 9:34 ` Sasha Levin
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 019/123] netfilter: nf_conncount: fix unexpected permanent node of list Sasha Levin
` (22 subsequent siblings)
25 siblings, 0 replies; 28+ messages in thread
From: Sasha Levin @ 2018-12-05 9:34 UTC (permalink / raw)
To: stable, linux-kernel
Cc: Taehee Yoo, Pablo Neira Ayuso, Sasha Levin, netfilter-devel,
coreteam, netdev
From: Taehee Yoo <ap420073@gmail.com>
[ Upstream commit 31568ec09ea02a050249921698c9729419539cce ]
nf_conncount_tuple is an element of nft_connlimit and that is deleted by
conn_free(). Elements can be deleted by both GC routine and data path
functions (nf_conncount_lookup, nf_conncount_add) and they call
conn_free() to free elements. But conn_free() only protects lists, not
each element. So that list_del corruption could occurred.
The conn_free() doesn't check whether element is already deleted. In
order to protect elements, dead flag is added. If an element is deleted,
dead flag is set. The only conn_free() can delete elements so that both
list lock and dead flag are enough to protect it.
test commands:
%nft add table ip filter
%nft add chain ip filter input { type filter hook input priority 0\; }
%nft add rule filter input meter test { ip id ct count over 2 } counter
splat looks like:
[ 1779.495778] list_del corruption, ffff8800b6e12008->prev is LIST_POISON2 (dead000000000200)
[ 1779.505453] ------------[ cut here ]------------
[ 1779.506260] kernel BUG at lib/list_debug.c:50!
[ 1779.515831] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[ 1779.516772] CPU: 0 PID: 33 Comm: kworker/0:2 Not tainted 4.19.0-rc6+ #22
[ 1779.516772] Workqueue: events_power_efficient nft_rhash_gc [nf_tables_set]
[ 1779.516772] RIP: 0010:__list_del_entry_valid+0xd8/0x150
[ 1779.516772] Code: 39 48 83 c4 08 b8 01 00 00 00 5b 5d c3 48 89 ea 48 c7 c7 00 c3 5b 98 e8 0f dc 40 ff 0f 0b 48 c7 c7 60 c3 5b 98 e8 01 dc 40 ff <0f> 0b 48 c7 c7 c0 c3 5b 98 e8 f3 db 40 ff 0f 0b 48 c7 c7 20 c4 5b
[ 1779.516772] RSP: 0018:ffff880119127420 EFLAGS: 00010286
[ 1779.516772] RAX: 000000000000004e RBX: dead000000000200 RCX: 0000000000000000
[ 1779.516772] RDX: 000000000000004e RSI: 0000000000000008 RDI: ffffed0023224e7a
[ 1779.516772] RBP: ffff88011934bc10 R08: ffffed002367cea9 R09: ffffed002367cea9
[ 1779.516772] R10: 0000000000000001 R11: ffffed002367cea8 R12: ffff8800b6e12008
[ 1779.516772] R13: ffff8800b6e12010 R14: ffff88011934bc20 R15: ffff8800b6e12008
[ 1779.516772] FS: 0000000000000000(0000) GS:ffff88011b200000(0000) knlGS:0000000000000000
[ 1779.516772] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1779.516772] CR2: 00007fc876534010 CR3: 000000010da16000 CR4: 00000000001006f0
[ 1779.516772] Call Trace:
[ 1779.516772] conn_free+0x9f/0x2b0 [nf_conncount]
[ 1779.516772] ? nf_ct_tmpl_alloc+0x2a0/0x2a0 [nf_conntrack]
[ 1779.516772] ? nf_conncount_add+0x520/0x520 [nf_conncount]
[ 1779.516772] ? do_raw_spin_trylock+0x1a0/0x1a0
[ 1779.516772] ? do_raw_spin_trylock+0x10/0x1a0
[ 1779.516772] find_or_evict+0xe5/0x150 [nf_conncount]
[ 1779.516772] nf_conncount_gc_list+0x162/0x360 [nf_conncount]
[ 1779.516772] ? nf_conncount_lookup+0xee0/0xee0 [nf_conncount]
[ 1779.516772] ? _raw_spin_unlock_irqrestore+0x45/0x50
[ 1779.516772] ? trace_hardirqs_off+0x6b/0x220
[ 1779.516772] ? trace_hardirqs_on_caller+0x220/0x220
[ 1779.516772] nft_rhash_gc+0x16b/0x540 [nf_tables_set]
[ ... ]
Fixes: 5c789e131cbb ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/netfilter/nf_conncount.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c
index 71b1f4f99580..cb33709138df 100644
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -49,6 +49,7 @@ struct nf_conncount_tuple {
struct nf_conntrack_zone zone;
int cpu;
u32 jiffies32;
+ bool dead;
struct rcu_head rcu_head;
};
@@ -106,6 +107,7 @@ nf_conncount_add(struct nf_conncount_list *list,
conn->zone = *zone;
conn->cpu = raw_smp_processor_id();
conn->jiffies32 = (u32)jiffies;
+ conn->dead = false;
spin_lock_bh(&list->list_lock);
if (list->dead == true) {
kmem_cache_free(conncount_conn_cachep, conn);
@@ -134,12 +136,13 @@ static bool conn_free(struct nf_conncount_list *list,
spin_lock_bh(&list->list_lock);
- if (list->count == 0) {
+ if (conn->dead) {
spin_unlock_bh(&list->list_lock);
- return free_entry;
+ return free_entry;
}
list->count--;
+ conn->dead = true;
list_del_rcu(&conn->node);
if (list->count == 0)
free_entry = true;
--
2.17.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH AUTOSEL 4.19 019/123] netfilter: nf_conncount: fix unexpected permanent node of list.
[not found] <20181205093555.5386-1-sashal@kernel.org>
` (2 preceding siblings ...)
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 018/123] netfilter: nf_conncount: fix list_del corruption in conn_free Sasha Levin
@ 2018-12-05 9:34 ` Sasha Levin
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 020/123] netfilter: nf_tables: don't skip inactive chains during update Sasha Levin
` (21 subsequent siblings)
25 siblings, 0 replies; 28+ messages in thread
From: Sasha Levin @ 2018-12-05 9:34 UTC (permalink / raw)
To: stable, linux-kernel
Cc: Taehee Yoo, Pablo Neira Ayuso, Sasha Levin, netfilter-devel,
coreteam, netdev
From: Taehee Yoo <ap420073@gmail.com>
[ Upstream commit 3c5cdb17c3be76714dfd0d03e384f70579545614 ]
When list->count is 0, the list is deleted by GC. But list->count is
never reached 0 because initial count value is 1 and it is increased
when node is inserted. So that initial value of list->count should be 0.
Originally GC always finds zero count list through deleting node and
decreasing count. However, list may be left empty since node insertion
may fail eg. allocaton problem. In order to solve this problem, GC
routine also finds zero count list without deleting node.
Fixes: cb2b36f5a97d ("netfilter: nf_conncount: Switch to plain list")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/netfilter/nf_conncount.c | 18 +++++++++++++++---
1 file changed, 15 insertions(+), 3 deletions(-)
diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c
index cb33709138df..8acae4a3e4c0 100644
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -144,8 +144,10 @@ static bool conn_free(struct nf_conncount_list *list,
list->count--;
conn->dead = true;
list_del_rcu(&conn->node);
- if (list->count == 0)
+ if (list->count == 0) {
+ list->dead = true;
free_entry = true;
+ }
spin_unlock_bh(&list->list_lock);
call_rcu(&conn->rcu_head, __conn_free);
@@ -248,7 +250,7 @@ void nf_conncount_list_init(struct nf_conncount_list *list)
{
spin_lock_init(&list->list_lock);
INIT_LIST_HEAD(&list->head);
- list->count = 1;
+ list->count = 0;
list->dead = false;
}
EXPORT_SYMBOL_GPL(nf_conncount_list_init);
@@ -262,6 +264,7 @@ bool nf_conncount_gc_list(struct net *net,
struct nf_conn *found_ct;
unsigned int collected = 0;
bool free_entry = false;
+ bool ret = false;
list_for_each_entry_safe(conn, conn_n, &list->head, node) {
found = find_or_evict(net, list, conn, &free_entry);
@@ -291,7 +294,15 @@ bool nf_conncount_gc_list(struct net *net,
if (collected > CONNCOUNT_GC_MAX_NODES)
return false;
}
- return false;
+
+ spin_lock_bh(&list->list_lock);
+ if (!list->count) {
+ list->dead = true;
+ ret = true;
+ }
+ spin_unlock_bh(&list->list_lock);
+
+ return ret;
}
EXPORT_SYMBOL_GPL(nf_conncount_gc_list);
@@ -417,6 +428,7 @@ insert_tree(struct net *net,
nf_conncount_list_init(&rbconn->list);
list_add(&conn->node, &rbconn->list.head);
count = 1;
+ rbconn->list.count = count;
rb_link_node(&rbconn->node, parent, rbnode);
rb_insert_color(&rbconn->node, root);
--
2.17.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH AUTOSEL 4.19 020/123] netfilter: nf_tables: don't skip inactive chains during update
[not found] <20181205093555.5386-1-sashal@kernel.org>
` (3 preceding siblings ...)
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 019/123] netfilter: nf_conncount: fix unexpected permanent node of list Sasha Levin
@ 2018-12-05 9:34 ` Sasha Levin
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 023/123] netfilter: xt_RATEEST: remove netns exit routine Sasha Levin
` (20 subsequent siblings)
25 siblings, 0 replies; 28+ messages in thread
From: Sasha Levin @ 2018-12-05 9:34 UTC (permalink / raw)
To: stable, linux-kernel
Cc: Florian Westphal, Pablo Neira Ayuso, Sasha Levin, netfilter-devel,
coreteam, netdev
From: Florian Westphal <fw@strlen.de>
[ Upstream commit 0fb39bbe43d4481fcf300d2b5822de60942fd189 ]
There is no synchronization between packet path and the configuration plane.
The packet path uses two arrays with rules, one contains the current (active)
generation. The other either contains the last (obsolete) generation or
the future one.
Consider:
cpu1 cpu2
nft_do_chain(c);
delete c
net->gen++;
genbit = !!net->gen;
rules = c->rg[genbit];
cpu1 ignores c when updating if c is not active anymore in the new
generation.
On cpu2, we now use rules from wrong generation, as c->rg[old]
contains the rules matching 'c' whereas c->rg[new] was not updated and
can even point to rules that have been free'd already, causing a crash.
To fix this, make sure that 'current' to the 'next' generation are
identical for chains that are going away so that c->rg[new] will just
use the matching rules even if genbit was incremented already.
Fixes: 0cbc06b3faba7 ("netfilter: nf_tables: remove synchronize_rcu in commit phase")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/netfilter/nf_tables_api.c | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 2cfb173cd0b2..4c016b49fe2b 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -6277,7 +6277,7 @@ static void nf_tables_commit_chain_free_rules_old(struct nft_rule **rules)
call_rcu(&old->h, __nf_tables_commit_chain_free_rules_old);
}
-static void nf_tables_commit_chain_active(struct net *net, struct nft_chain *chain)
+static void nf_tables_commit_chain(struct net *net, struct nft_chain *chain)
{
struct nft_rule **g0, **g1;
bool next_genbit;
@@ -6363,11 +6363,8 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb)
/* step 2. Make rules_gen_X visible to packet path */
list_for_each_entry(table, &net->nft.tables, list) {
- list_for_each_entry(chain, &table->chains, list) {
- if (!nft_is_active_next(net, chain))
- continue;
- nf_tables_commit_chain_active(net, chain);
- }
+ list_for_each_entry(chain, &table->chains, list)
+ nf_tables_commit_chain(net, chain);
}
/*
--
2.17.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH AUTOSEL 4.19 023/123] netfilter: xt_RATEEST: remove netns exit routine
[not found] <20181205093555.5386-1-sashal@kernel.org>
` (4 preceding siblings ...)
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 020/123] netfilter: nf_tables: don't skip inactive chains during update Sasha Levin
@ 2018-12-05 9:34 ` Sasha Levin
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 024/123] netfilter: nf_tables: fix use-after-free when deleting compat expressions Sasha Levin
` (19 subsequent siblings)
25 siblings, 0 replies; 28+ messages in thread
From: Sasha Levin @ 2018-12-05 9:34 UTC (permalink / raw)
To: stable, linux-kernel
Cc: Taehee Yoo, Pablo Neira Ayuso, Sasha Levin, netfilter-devel,
coreteam, netdev
From: Taehee Yoo <ap420073@gmail.com>
[ Upstream commit 0fbcc5b568edab7d848b7c7fa66d44ffbd4133c0 ]
xt_rateest_net_exit() was added to check whether rules are flushed
successfully. but ->net_exit() callback is called earlier than
->destroy() callback.
So that ->net_exit() callback can't check that.
test commands:
%ip netns add vm1
%ip netns exec vm1 iptables -t mangle -I PREROUTING -p udp \
--dport 1111 -j RATEEST --rateest-name ap \
--rateest-interval 250ms --rateest-ewma 0.5s
%ip netns del vm1
splat looks like:
[ 668.813518] WARNING: CPU: 0 PID: 87 at net/netfilter/xt_RATEEST.c:210 xt_rateest_net_exit+0x210/0x340 [xt_RATEEST]
[ 668.813518] Modules linked in: xt_RATEEST xt_tcpudp iptable_mangle bpfilter ip_tables x_tables
[ 668.813518] CPU: 0 PID: 87 Comm: kworker/u4:2 Not tainted 4.19.0-rc7+ #21
[ 668.813518] Workqueue: netns cleanup_net
[ 668.813518] RIP: 0010:xt_rateest_net_exit+0x210/0x340 [xt_RATEEST]
[ 668.813518] Code: 00 48 8b 85 30 ff ff ff 4c 8b 23 80 38 00 0f 85 24 01 00 00 48 8b 85 30 ff ff ff 4d 85 e4 4c 89 a5 58 ff ff ff c6 00 f8 74 b2 <0f> 0b 48 83 c3 08 4c 39 f3 75 b0 48 b8 00 00 00 00 00 fc ff df 49
[ 668.813518] RSP: 0018:ffff8801156c73f8 EFLAGS: 00010282
[ 668.813518] RAX: ffffed0022ad8e85 RBX: ffff880118928e98 RCX: 5db8012a00000000
[ 668.813518] RDX: ffff8801156c7428 RSI: 00000000cb1d185f RDI: ffff880115663b74
[ 668.813518] RBP: ffff8801156c74d0 R08: ffff8801156633c0 R09: 1ffff100236440be
[ 668.813518] R10: 0000000000000001 R11: ffffed002367d852 R12: ffff880115142b08
[ 668.813518] R13: 1ffff10022ad8e81 R14: ffff880118928ea8 R15: dffffc0000000000
[ 668.813518] FS: 0000000000000000(0000) GS:ffff88011b200000(0000) knlGS:0000000000000000
[ 668.813518] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 668.813518] CR2: 0000563aa69f4f28 CR3: 0000000105a16000 CR4: 00000000001006f0
[ 668.813518] Call Trace:
[ 668.813518] ? unregister_netdevice_many+0xe0/0xe0
[ 668.813518] ? xt_rateest_net_init+0x2c0/0x2c0 [xt_RATEEST]
[ 668.813518] ? default_device_exit+0x1ca/0x270
[ 668.813518] ? remove_proc_entry+0x1cd/0x390
[ 668.813518] ? dev_change_net_namespace+0xd00/0xd00
[ 668.813518] ? __init_waitqueue_head+0x130/0x130
[ 668.813518] ops_exit_list.isra.10+0x94/0x140
[ 668.813518] cleanup_net+0x45b/0x900
[ 668.813518] ? net_drop_ns+0x110/0x110
[ 668.813518] ? swapgs_restore_regs_and_return_to_usermode+0x3c/0x80
[ 668.813518] ? save_trace+0x300/0x300
[ 668.813518] ? lock_acquire+0x196/0x470
[ 668.813518] ? lock_acquire+0x196/0x470
[ 668.813518] ? process_one_work+0xb60/0x1de0
[ 668.813518] ? _raw_spin_unlock_irq+0x29/0x40
[ 668.813518] ? _raw_spin_unlock_irq+0x29/0x40
[ 668.813518] ? __lock_acquire+0x4500/0x4500
[ 668.813518] ? __lock_is_held+0xb4/0x140
[ 668.813518] process_one_work+0xc13/0x1de0
[ 668.813518] ? pwq_dec_nr_in_flight+0x3c0/0x3c0
[ 668.813518] ? set_load_weight+0x270/0x270
[ ... ]
Fixes: 3427b2ab63fa ("netfilter: make xt_rateest hash table per net")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/netfilter/xt_RATEEST.c | 10 ----------
1 file changed, 10 deletions(-)
diff --git a/net/netfilter/xt_RATEEST.c b/net/netfilter/xt_RATEEST.c
index dec843cadf46..9e05c86ba5c4 100644
--- a/net/netfilter/xt_RATEEST.c
+++ b/net/netfilter/xt_RATEEST.c
@@ -201,18 +201,8 @@ static __net_init int xt_rateest_net_init(struct net *net)
return 0;
}
-static void __net_exit xt_rateest_net_exit(struct net *net)
-{
- struct xt_rateest_net *xn = net_generic(net, xt_rateest_id);
- int i;
-
- for (i = 0; i < ARRAY_SIZE(xn->hash); i++)
- WARN_ON_ONCE(!hlist_empty(&xn->hash[i]));
-}
-
static struct pernet_operations xt_rateest_net_ops = {
.init = xt_rateest_net_init,
- .exit = xt_rateest_net_exit,
.id = &xt_rateest_id,
.size = sizeof(struct xt_rateest_net),
};
--
2.17.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH AUTOSEL 4.19 024/123] netfilter: nf_tables: fix use-after-free when deleting compat expressions
[not found] <20181205093555.5386-1-sashal@kernel.org>
` (5 preceding siblings ...)
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 023/123] netfilter: xt_RATEEST: remove netns exit routine Sasha Levin
@ 2018-12-05 9:34 ` Sasha Levin
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 040/123] bpf: allocate local storage buffers using GFP_ATOMIC Sasha Levin
` (18 subsequent siblings)
25 siblings, 0 replies; 28+ messages in thread
From: Sasha Levin @ 2018-12-05 9:34 UTC (permalink / raw)
To: stable, linux-kernel
Cc: Florian Westphal, Pablo Neira Ayuso, Sasha Levin, netfilter-devel,
coreteam, netdev
From: Florian Westphal <fw@strlen.de>
[ Upstream commit 29e3880109e357fdc607b4393f8308cef6af9413 ]
nft_compat ops do not have static storage duration, unlike all other
expressions.
When nf_tables_expr_destroy() returns, expr->ops might have been
free'd already, so we need to store next address before calling
expression destructor.
For same reason, we can't deref match pointer after nft_xt_put().
This can be easily reproduced by adding msleep() before
nft_match_destroy() returns.
Fixes: 0ca743a55991 ("netfilter: nf_tables: add compatibility layer for x_tables")
Reported-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/netfilter/nf_tables_api.c | 5 +++--
net/netfilter/nft_compat.c | 3 ++-
2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 4c016b49fe2b..06ed55cef962 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -2432,7 +2432,7 @@ static int nf_tables_getrule(struct net *net, struct sock *nlsk,
static void nf_tables_rule_destroy(const struct nft_ctx *ctx,
struct nft_rule *rule)
{
- struct nft_expr *expr;
+ struct nft_expr *expr, *next;
lockdep_assert_held(&ctx->net->nft.commit_mutex);
/*
@@ -2441,8 +2441,9 @@ static void nf_tables_rule_destroy(const struct nft_ctx *ctx,
*/
expr = nft_expr_first(rule);
while (expr != nft_expr_last(rule) && expr->ops) {
+ next = nft_expr_next(expr);
nf_tables_expr_destroy(ctx, expr);
- expr = nft_expr_next(expr);
+ expr = next;
}
kfree(rule);
}
diff --git a/net/netfilter/nft_compat.c b/net/netfilter/nft_compat.c
index ad2fe6a7e47d..29d6fc73caf9 100644
--- a/net/netfilter/nft_compat.c
+++ b/net/netfilter/nft_compat.c
@@ -501,6 +501,7 @@ __nft_match_destroy(const struct nft_ctx *ctx, const struct nft_expr *expr,
void *info)
{
struct xt_match *match = expr->ops->data;
+ struct module *me = match->me;
struct xt_mtdtor_param par;
par.net = ctx->net;
@@ -511,7 +512,7 @@ __nft_match_destroy(const struct nft_ctx *ctx, const struct nft_expr *expr,
par.match->destroy(&par);
if (nft_xt_put(container_of(expr->ops, struct nft_xt, ops)))
- module_put(match->me);
+ module_put(me);
}
static void
--
2.17.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH AUTOSEL 4.19 040/123] bpf: allocate local storage buffers using GFP_ATOMIC
[not found] <20181205093555.5386-1-sashal@kernel.org>
` (6 preceding siblings ...)
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 024/123] netfilter: nf_tables: fix use-after-free when deleting compat expressions Sasha Levin
@ 2018-12-05 9:34 ` Sasha Levin
2018-12-07 6:40 ` Naresh Kamboju
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 042/123] netfilter: xt_hashlimit: fix a possible memory leak in htable_create() Sasha Levin
` (17 subsequent siblings)
25 siblings, 1 reply; 28+ messages in thread
From: Sasha Levin @ 2018-12-05 9:34 UTC (permalink / raw)
To: stable, linux-kernel
Cc: Roman Gushchin, Roman Gushchin, Alexei Starovoitov,
Daniel Borkmann, Sasha Levin, netdev
From: Roman Gushchin <guroan@gmail.com>
[ Upstream commit 569a933b03f3c48b392fe67c0086b3a6b9306b5a ]
Naresh reported an issue with the non-atomic memory allocation of
cgroup local storage buffers:
[ 73.047526] BUG: sleeping function called from invalid context at
/srv/oe/build/tmp-rpb-glibc/work-shared/intel-corei7-64/kernel-source/mm/slab.h:421
[ 73.060915] in_atomic(): 1, irqs_disabled(): 0, pid: 3157, name: test_cgroup_sto
[ 73.068342] INFO: lockdep is turned off.
[ 73.072293] CPU: 2 PID: 3157 Comm: test_cgroup_sto Not tainted
4.20.0-rc2-next-20181113 #1
[ 73.080548] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
2.0b 07/27/2017
[ 73.088018] Call Trace:
[ 73.090463] dump_stack+0x70/0xa5
[ 73.093783] ___might_sleep+0x152/0x240
[ 73.097619] __might_sleep+0x4a/0x80
[ 73.101191] __kmalloc_node+0x1cf/0x2f0
[ 73.105031] ? cgroup_storage_update_elem+0x46/0x90
[ 73.109909] cgroup_storage_update_elem+0x46/0x90
cgroup_storage_update_elem() (as well as other update map update
callbacks) is called with disabled preemption, so GFP_ATOMIC
allocation should be used: e.g. alloc_htab_elem() in hashtab.c.
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
kernel/bpf/local_storage.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/bpf/local_storage.c b/kernel/bpf/local_storage.c
index 830d7f095748..fc1605aee5ea 100644
--- a/kernel/bpf/local_storage.c
+++ b/kernel/bpf/local_storage.c
@@ -138,7 +138,8 @@ static int cgroup_storage_update_elem(struct bpf_map *map, void *_key,
return -ENOENT;
new = kmalloc_node(sizeof(struct bpf_storage_buffer) +
- map->value_size, __GFP_ZERO | GFP_USER,
+ map->value_size,
+ __GFP_ZERO | GFP_ATOMIC | __GFP_NOWARN,
map->numa_node);
if (!new)
return -ENOMEM;
--
2.17.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH AUTOSEL 4.19 042/123] netfilter: xt_hashlimit: fix a possible memory leak in htable_create()
[not found] <20181205093555.5386-1-sashal@kernel.org>
` (7 preceding siblings ...)
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 040/123] bpf: allocate local storage buffers using GFP_ATOMIC Sasha Levin
@ 2018-12-05 9:34 ` Sasha Levin
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 058/123] tools: bpftool: fix potential NULL pointer dereference in do_load Sasha Levin
` (16 subsequent siblings)
25 siblings, 0 replies; 28+ messages in thread
From: Sasha Levin @ 2018-12-05 9:34 UTC (permalink / raw)
To: stable, linux-kernel
Cc: Taehee Yoo, Pablo Neira Ayuso, Sasha Levin, netfilter-devel,
coreteam, netdev
From: Taehee Yoo <ap420073@gmail.com>
[ Upstream commit b4e955e9f372035361fbc6f07b21fe2cc6a5be4a ]
In the htable_create(), hinfo is allocated by vmalloc()
So that if error occurred, hinfo should be freed.
Fixes: 11d5f15723c9 ("netfilter: xt_hashlimit: Create revision 2 to support higher pps rates")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/netfilter/xt_hashlimit.c | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/net/netfilter/xt_hashlimit.c b/net/netfilter/xt_hashlimit.c
index 3e7d259e5d8d..1ad4017f9b73 100644
--- a/net/netfilter/xt_hashlimit.c
+++ b/net/netfilter/xt_hashlimit.c
@@ -295,9 +295,10 @@ static int htable_create(struct net *net, struct hashlimit_cfg3 *cfg,
/* copy match config into hashtable config */
ret = cfg_copy(&hinfo->cfg, (void *)cfg, 3);
-
- if (ret)
+ if (ret) {
+ vfree(hinfo);
return ret;
+ }
hinfo->cfg.size = size;
if (hinfo->cfg.max == 0)
@@ -814,7 +815,6 @@ hashlimit_mt_v1(const struct sk_buff *skb, struct xt_action_param *par)
int ret;
ret = cfg_copy(&cfg, (void *)&info->cfg, 1);
-
if (ret)
return ret;
@@ -830,7 +830,6 @@ hashlimit_mt_v2(const struct sk_buff *skb, struct xt_action_param *par)
int ret;
ret = cfg_copy(&cfg, (void *)&info->cfg, 2);
-
if (ret)
return ret;
@@ -921,7 +920,6 @@ static int hashlimit_mt_check_v1(const struct xt_mtchk_param *par)
return ret;
ret = cfg_copy(&cfg, (void *)&info->cfg, 1);
-
if (ret)
return ret;
@@ -940,7 +938,6 @@ static int hashlimit_mt_check_v2(const struct xt_mtchk_param *par)
return ret;
ret = cfg_copy(&cfg, (void *)&info->cfg, 2);
-
if (ret)
return ret;
--
2.17.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH AUTOSEL 4.19 058/123] tools: bpftool: fix potential NULL pointer dereference in do_load
[not found] <20181205093555.5386-1-sashal@kernel.org>
` (8 preceding siblings ...)
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 042/123] netfilter: xt_hashlimit: fix a possible memory leak in htable_create() Sasha Levin
@ 2018-12-05 9:34 ` Sasha Levin
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 065/123] bpf: fix check of allowed specifiers in bpf_trace_printk Sasha Levin
` (15 subsequent siblings)
25 siblings, 0 replies; 28+ messages in thread
From: Sasha Levin @ 2018-12-05 9:34 UTC (permalink / raw)
To: stable, linux-kernel
Cc: Jakub Kicinski, Wen Yang, Daniel Borkmann, Sasha Levin, netdev
From: Jakub Kicinski <jakub.kicinski@netronome.com>
[ Upstream commit dde7011a824cfa815b03f853ec985ff46b740939 ]
This patch fixes a possible null pointer dereference in
do_load, detected by the semantic patch deref_null.cocci,
with the following warning:
./tools/bpf/bpftool/prog.c:1021:23-25: ERROR: map_replace is NULL but dereferenced.
The following code has potential null pointer references:
881 map_replace = reallocarray(map_replace, old_map_fds + 1,
882 sizeof(*map_replace));
883 if (!map_replace) {
884 p_err("mem alloc failed");
885 goto err_free_reuse_maps;
886 }
...
1019 err_free_reuse_maps:
1020 for (i = 0; i < old_map_fds; i++)
1021 close(map_replace[i].fd);
1022 free(map_replace);
Fixes: 3ff5a4dc5d89 ("tools: bpftool: allow reuse of maps with bpftool prog load")
Co-developed-by: Wen Yang <wen.yang99@zte.com.cn>
Signed-off-by: Wen Yang <wen.yang99@zte.com.cn>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
tools/bpf/bpftool/prog.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
index dce960d22106..0de024a6cc2b 100644
--- a/tools/bpf/bpftool/prog.c
+++ b/tools/bpf/bpftool/prog.c
@@ -749,6 +749,7 @@ static int do_load(int argc, char **argv)
}
NEXT_ARG();
} else if (is_prefix(*argv, "map")) {
+ void *new_map_replace;
char *endptr, *name;
int fd;
@@ -782,12 +783,15 @@ static int do_load(int argc, char **argv)
if (fd < 0)
goto err_free_reuse_maps;
- map_replace = reallocarray(map_replace, old_map_fds + 1,
- sizeof(*map_replace));
- if (!map_replace) {
+ new_map_replace = reallocarray(map_replace,
+ old_map_fds + 1,
+ sizeof(*map_replace));
+ if (!new_map_replace) {
p_err("mem alloc failed");
goto err_free_reuse_maps;
}
+ map_replace = new_map_replace;
+
map_replace[old_map_fds].idx = idx;
map_replace[old_map_fds].name = name;
map_replace[old_map_fds].fd = fd;
--
2.17.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH AUTOSEL 4.19 065/123] bpf: fix check of allowed specifiers in bpf_trace_printk
[not found] <20181205093555.5386-1-sashal@kernel.org>
` (9 preceding siblings ...)
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 058/123] tools: bpftool: fix potential NULL pointer dereference in do_load Sasha Levin
@ 2018-12-05 9:34 ` Sasha Levin
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 067/123] ipvs: call ip_vs_dst_notifier earlier than ipv6_dev_notf Sasha Levin
` (14 subsequent siblings)
25 siblings, 0 replies; 28+ messages in thread
From: Sasha Levin @ 2018-12-05 9:34 UTC (permalink / raw)
To: stable, linux-kernel
Cc: Martynas Pumputis, Daniel Borkmann, Sasha Levin, netdev
From: Martynas Pumputis <m@lambda.lt>
[ Upstream commit 1efb6ee3edea57f57f9fb05dba8dcb3f7333f61f ]
A format string consisting of "%p" or "%s" followed by an invalid
specifier (e.g. "%p%\n" or "%s%") could pass the check which
would make format_decode (lib/vsprintf.c) to warn.
Fixes: 9c959c863f82 ("tracing: Allow BPF programs to call bpf_trace_printk()")
Reported-by: syzbot+1ec5c5ec949c4adaa0c4@syzkaller.appspotmail.com
Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
kernel/trace/bpf_trace.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 08fcfe440c63..9864a35c8bb5 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -196,11 +196,13 @@ BPF_CALL_5(bpf_trace_printk, char *, fmt, u32, fmt_size, u64, arg1,
i++;
} else if (fmt[i] == 'p' || fmt[i] == 's') {
mod[fmt_cnt]++;
- i++;
- if (!isspace(fmt[i]) && !ispunct(fmt[i]) && fmt[i] != 0)
+ /* disallow any further format extensions */
+ if (fmt[i + 1] != 0 &&
+ !isspace(fmt[i + 1]) &&
+ !ispunct(fmt[i + 1]))
return -EINVAL;
fmt_cnt++;
- if (fmt[i - 1] == 's') {
+ if (fmt[i] == 's') {
if (str_seen)
/* allow only one '%s' per fmt string */
return -EINVAL;
--
2.17.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH AUTOSEL 4.19 067/123] ipvs: call ip_vs_dst_notifier earlier than ipv6_dev_notf
[not found] <20181205093555.5386-1-sashal@kernel.org>
` (10 preceding siblings ...)
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 065/123] bpf: fix check of allowed specifiers in bpf_trace_printk Sasha Levin
@ 2018-12-05 9:34 ` Sasha Levin
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 075/123] netfilter: ipv6: Preserve link scope traffic original oif Sasha Levin
` (13 subsequent siblings)
25 siblings, 0 replies; 28+ messages in thread
From: Sasha Levin @ 2018-12-05 9:34 UTC (permalink / raw)
To: stable, linux-kernel
Cc: Xin Long, Pablo Neira Ayuso, Sasha Levin, netdev, lvs-devel,
netfilter-devel, coreteam
From: Xin Long <lucien.xin@gmail.com>
[ Upstream commit 2a31e4bd9ad255ee40809b5c798c4b1c2b09703b ]
ip_vs_dst_event is supposed to clean up all dst used in ipvs'
destinations when a net dev is going down. But it works only
when the dst's dev is the same as the dev from the event.
Now with the same priority but late registration,
ip_vs_dst_notifier is always called later than ipv6_dev_notf
where the dst's dev is set to lo for NETDEV_DOWN event.
As the dst's dev lo is not the same as the dev from the event
in ip_vs_dst_event, ip_vs_dst_notifier doesn't actually work.
Also as these dst have to wait for dest_trash_timer to clean
them up. It would cause some non-permanent kernel warnings:
unregister_netdevice: waiting for br0 to become free. Usage count = 3
To fix it, call ip_vs_dst_notifier earlier than ipv6_dev_notf
by increasing its priority to ADDRCONF_NOTIFY_PRIORITY + 5.
Note that for ipv4 route fib_netdev_notifier doesn't set dst's
dev to lo in NETDEV_DOWN event, so this fix is only needed when
IP_VS_IPV6 is defined.
Fixes: 7a4f0761fce3 ("IPVS: init and cleanup restructuring")
Reported-by: Li Shuang <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/netfilter/ipvs/ip_vs_ctl.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index 62eefea48973..518364f4abcc 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -3980,6 +3980,9 @@ static void __net_exit ip_vs_control_net_cleanup_sysctl(struct netns_ipvs *ipvs)
static struct notifier_block ip_vs_dst_notifier = {
.notifier_call = ip_vs_dst_event,
+#ifdef CONFIG_IP_VS_IPV6
+ .priority = ADDRCONF_NOTIFY_PRIORITY + 5,
+#endif
};
int __net_init ip_vs_control_net_init(struct netns_ipvs *ipvs)
--
2.17.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH AUTOSEL 4.19 075/123] netfilter: ipv6: Preserve link scope traffic original oif
[not found] <20181205093555.5386-1-sashal@kernel.org>
` (11 preceding siblings ...)
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 067/123] ipvs: call ip_vs_dst_notifier earlier than ipv6_dev_notf Sasha Levin
@ 2018-12-05 9:35 ` Sasha Levin
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 077/123] netfilter: add missing error handling code for register functions Sasha Levin
` (12 subsequent siblings)
25 siblings, 0 replies; 28+ messages in thread
From: Sasha Levin @ 2018-12-05 9:35 UTC (permalink / raw)
To: stable, linux-kernel
Cc: Alin Nastac, Pablo Neira Ayuso, Sasha Levin, netfilter-devel,
coreteam, netdev
From: Alin Nastac <alin.nastac@gmail.com>
[ Upstream commit 508b09046c0f21678652fb66fd1e9959d55591d2 ]
When ip6_route_me_harder is invoked, it resets outgoing interface of:
- link-local scoped packets sent by neighbor discovery
- multicast packets sent by MLD host
- multicast packets send by MLD proxy daemon that sets outgoing
interface through IPV6_PKTINFO ipi6_ifindex
Link-local and multicast packets must keep their original oif after
ip6_route_me_harder is called.
Signed-off-by: Alin Nastac <alin.nastac@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/ipv6/netfilter.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/ipv6/netfilter.c b/net/ipv6/netfilter.c
index 5ae8e1c51079..8b075f0bc351 100644
--- a/net/ipv6/netfilter.c
+++ b/net/ipv6/netfilter.c
@@ -24,7 +24,8 @@ int ip6_route_me_harder(struct net *net, struct sk_buff *skb)
unsigned int hh_len;
struct dst_entry *dst;
struct flowi6 fl6 = {
- .flowi6_oif = sk ? sk->sk_bound_dev_if : 0,
+ .flowi6_oif = sk && sk->sk_bound_dev_if ? sk->sk_bound_dev_if :
+ rt6_need_strict(&iph->daddr) ? skb_dst(skb)->dev->ifindex : 0,
.flowi6_mark = skb->mark,
.flowi6_uid = sock_net_uid(net, sk),
.daddr = iph->daddr,
--
2.17.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH AUTOSEL 4.19 077/123] netfilter: add missing error handling code for register functions
[not found] <20181205093555.5386-1-sashal@kernel.org>
` (12 preceding siblings ...)
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 075/123] netfilter: ipv6: Preserve link scope traffic original oif Sasha Levin
@ 2018-12-05 9:35 ` Sasha Levin
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 078/123] netfilter: nat: fix double register in masquerade modules Sasha Levin
` (11 subsequent siblings)
25 siblings, 0 replies; 28+ messages in thread
From: Sasha Levin @ 2018-12-05 9:35 UTC (permalink / raw)
To: stable, linux-kernel
Cc: Taehee Yoo, Pablo Neira Ayuso, Sasha Levin, netfilter-devel,
coreteam, netdev
From: Taehee Yoo <ap420073@gmail.com>
[ Upstream commit 584eab291c67894cb17cc87544b9d086228ea70f ]
register_{netdevice/inetaddr/inet6addr}_notifier may return an error
value, this patch adds the code to handle these error paths.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
.../net/netfilter/ipv4/nf_nat_masquerade.h | 2 +-
.../net/netfilter/ipv6/nf_nat_masquerade.h | 2 +-
net/ipv4/netfilter/ipt_MASQUERADE.c | 7 ++--
net/ipv4/netfilter/nf_nat_masquerade_ipv4.c | 21 +++++++++---
net/ipv4/netfilter/nft_masq_ipv4.c | 4 ++-
net/ipv6/netfilter/ip6t_MASQUERADE.c | 8 +++--
net/ipv6/netfilter/nf_nat_masquerade_ipv6.c | 32 +++++++++++++------
net/ipv6/netfilter/nft_masq_ipv6.c | 4 ++-
net/netfilter/nft_flow_offload.c | 5 ++-
9 files changed, 63 insertions(+), 22 deletions(-)
diff --git a/include/net/netfilter/ipv4/nf_nat_masquerade.h b/include/net/netfilter/ipv4/nf_nat_masquerade.h
index cd24be4c4a99..13d55206bb9f 100644
--- a/include/net/netfilter/ipv4/nf_nat_masquerade.h
+++ b/include/net/netfilter/ipv4/nf_nat_masquerade.h
@@ -9,7 +9,7 @@ nf_nat_masquerade_ipv4(struct sk_buff *skb, unsigned int hooknum,
const struct nf_nat_range2 *range,
const struct net_device *out);
-void nf_nat_masquerade_ipv4_register_notifier(void);
+int nf_nat_masquerade_ipv4_register_notifier(void);
void nf_nat_masquerade_ipv4_unregister_notifier(void);
#endif /*_NF_NAT_MASQUERADE_IPV4_H_ */
diff --git a/include/net/netfilter/ipv6/nf_nat_masquerade.h b/include/net/netfilter/ipv6/nf_nat_masquerade.h
index 0c3b5ebf0bb8..2917bf95c437 100644
--- a/include/net/netfilter/ipv6/nf_nat_masquerade.h
+++ b/include/net/netfilter/ipv6/nf_nat_masquerade.h
@@ -5,7 +5,7 @@
unsigned int
nf_nat_masquerade_ipv6(struct sk_buff *skb, const struct nf_nat_range2 *range,
const struct net_device *out);
-void nf_nat_masquerade_ipv6_register_notifier(void);
+int nf_nat_masquerade_ipv6_register_notifier(void);
void nf_nat_masquerade_ipv6_unregister_notifier(void);
#endif /* _NF_NAT_MASQUERADE_IPV6_H_ */
diff --git a/net/ipv4/netfilter/ipt_MASQUERADE.c b/net/ipv4/netfilter/ipt_MASQUERADE.c
index ce1512b02cb2..fd3f9e8a74da 100644
--- a/net/ipv4/netfilter/ipt_MASQUERADE.c
+++ b/net/ipv4/netfilter/ipt_MASQUERADE.c
@@ -81,9 +81,12 @@ static int __init masquerade_tg_init(void)
int ret;
ret = xt_register_target(&masquerade_tg_reg);
+ if (ret)
+ return ret;
- if (ret == 0)
- nf_nat_masquerade_ipv4_register_notifier();
+ ret = nf_nat_masquerade_ipv4_register_notifier();
+ if (ret)
+ xt_unregister_target(&masquerade_tg_reg);
return ret;
}
diff --git a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
index ad3aeff152ed..4a7c1f207d6e 100644
--- a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
+++ b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
@@ -133,16 +133,29 @@ static struct notifier_block masq_inet_notifier = {
static atomic_t masquerade_notifier_refcount = ATOMIC_INIT(0);
-void nf_nat_masquerade_ipv4_register_notifier(void)
+int nf_nat_masquerade_ipv4_register_notifier(void)
{
+ int ret;
+
/* check if the notifier was already set */
if (atomic_inc_return(&masquerade_notifier_refcount) > 1)
- return;
+ return 0;
/* Register for device down reports */
- register_netdevice_notifier(&masq_dev_notifier);
+ ret = register_netdevice_notifier(&masq_dev_notifier);
+ if (ret)
+ goto err_dec;
/* Register IP address change reports */
- register_inetaddr_notifier(&masq_inet_notifier);
+ ret = register_inetaddr_notifier(&masq_inet_notifier);
+ if (ret)
+ goto err_unregister;
+
+ return ret;
+err_unregister:
+ unregister_netdevice_notifier(&masq_dev_notifier);
+err_dec:
+ atomic_dec(&masquerade_notifier_refcount);
+ return ret;
}
EXPORT_SYMBOL_GPL(nf_nat_masquerade_ipv4_register_notifier);
diff --git a/net/ipv4/netfilter/nft_masq_ipv4.c b/net/ipv4/netfilter/nft_masq_ipv4.c
index f1193e1e928a..6847de1d1db8 100644
--- a/net/ipv4/netfilter/nft_masq_ipv4.c
+++ b/net/ipv4/netfilter/nft_masq_ipv4.c
@@ -69,7 +69,9 @@ static int __init nft_masq_ipv4_module_init(void)
if (ret < 0)
return ret;
- nf_nat_masquerade_ipv4_register_notifier();
+ ret = nf_nat_masquerade_ipv4_register_notifier();
+ if (ret)
+ nft_unregister_expr(&nft_masq_ipv4_type);
return ret;
}
diff --git a/net/ipv6/netfilter/ip6t_MASQUERADE.c b/net/ipv6/netfilter/ip6t_MASQUERADE.c
index 491f808e356a..29c7f1915a96 100644
--- a/net/ipv6/netfilter/ip6t_MASQUERADE.c
+++ b/net/ipv6/netfilter/ip6t_MASQUERADE.c
@@ -58,8 +58,12 @@ static int __init masquerade_tg6_init(void)
int err;
err = xt_register_target(&masquerade_tg6_reg);
- if (err == 0)
- nf_nat_masquerade_ipv6_register_notifier();
+ if (err)
+ return err;
+
+ err = nf_nat_masquerade_ipv6_register_notifier();
+ if (err)
+ xt_unregister_target(&masquerade_tg6_reg);
return err;
}
diff --git a/net/ipv6/netfilter/nf_nat_masquerade_ipv6.c b/net/ipv6/netfilter/nf_nat_masquerade_ipv6.c
index e6eb7cf9b54f..10012fc687b6 100644
--- a/net/ipv6/netfilter/nf_nat_masquerade_ipv6.c
+++ b/net/ipv6/netfilter/nf_nat_masquerade_ipv6.c
@@ -120,8 +120,8 @@ static void iterate_cleanup_work(struct work_struct *work)
* of ipv6 addresses being deleted), we also need to add an upper
* limit to the number of queued work items.
*/
-static int masq_inet_event(struct notifier_block *this,
- unsigned long event, void *ptr)
+static int masq_inet6_event(struct notifier_block *this,
+ unsigned long event, void *ptr)
{
struct inet6_ifaddr *ifa = ptr;
const struct net_device *dev;
@@ -158,20 +158,34 @@ static int masq_inet_event(struct notifier_block *this,
return NOTIFY_DONE;
}
-static struct notifier_block masq_inet_notifier = {
- .notifier_call = masq_inet_event,
+static struct notifier_block masq_inet6_notifier = {
+ .notifier_call = masq_inet6_event,
};
static atomic_t masquerade_notifier_refcount = ATOMIC_INIT(0);
-void nf_nat_masquerade_ipv6_register_notifier(void)
+int nf_nat_masquerade_ipv6_register_notifier(void)
{
+ int ret;
+
/* check if the notifier is already set */
if (atomic_inc_return(&masquerade_notifier_refcount) > 1)
- return;
+ return 0;
- register_netdevice_notifier(&masq_dev_notifier);
- register_inet6addr_notifier(&masq_inet_notifier);
+ ret = register_netdevice_notifier(&masq_dev_notifier);
+ if (ret)
+ goto err_dec;
+
+ ret = register_inet6addr_notifier(&masq_inet6_notifier);
+ if (ret)
+ goto err_unregister;
+
+ return ret;
+err_unregister:
+ unregister_netdevice_notifier(&masq_dev_notifier);
+err_dec:
+ atomic_dec(&masquerade_notifier_refcount);
+ return ret;
}
EXPORT_SYMBOL_GPL(nf_nat_masquerade_ipv6_register_notifier);
@@ -181,7 +195,7 @@ void nf_nat_masquerade_ipv6_unregister_notifier(void)
if (atomic_dec_return(&masquerade_notifier_refcount) > 0)
return;
- unregister_inet6addr_notifier(&masq_inet_notifier);
+ unregister_inet6addr_notifier(&masq_inet6_notifier);
unregister_netdevice_notifier(&masq_dev_notifier);
}
EXPORT_SYMBOL_GPL(nf_nat_masquerade_ipv6_unregister_notifier);
diff --git a/net/ipv6/netfilter/nft_masq_ipv6.c b/net/ipv6/netfilter/nft_masq_ipv6.c
index dd0122f3cffe..e06c82e9dfcd 100644
--- a/net/ipv6/netfilter/nft_masq_ipv6.c
+++ b/net/ipv6/netfilter/nft_masq_ipv6.c
@@ -70,7 +70,9 @@ static int __init nft_masq_ipv6_module_init(void)
if (ret < 0)
return ret;
- nf_nat_masquerade_ipv6_register_notifier();
+ ret = nf_nat_masquerade_ipv6_register_notifier();
+ if (ret)
+ nft_unregister_expr(&nft_masq_ipv6_type);
return ret;
}
diff --git a/net/netfilter/nft_flow_offload.c b/net/netfilter/nft_flow_offload.c
index d6bab8c3cbb0..5fd4c57c79cc 100644
--- a/net/netfilter/nft_flow_offload.c
+++ b/net/netfilter/nft_flow_offload.c
@@ -214,7 +214,9 @@ static int __init nft_flow_offload_module_init(void)
{
int err;
- register_netdevice_notifier(&flow_offload_netdev_notifier);
+ err = register_netdevice_notifier(&flow_offload_netdev_notifier);
+ if (err)
+ goto err;
err = nft_register_expr(&nft_flow_offload_type);
if (err < 0)
@@ -224,6 +226,7 @@ static int __init nft_flow_offload_module_init(void)
register_expr:
unregister_netdevice_notifier(&flow_offload_netdev_notifier);
+err:
return err;
}
--
2.17.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH AUTOSEL 4.19 078/123] netfilter: nat: fix double register in masquerade modules
[not found] <20181205093555.5386-1-sashal@kernel.org>
` (13 preceding siblings ...)
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 077/123] netfilter: add missing error handling code for register functions Sasha Levin
@ 2018-12-05 9:35 ` Sasha Levin
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 079/123] netfilter: nf_conncount: remove wrong condition check routine Sasha Levin
` (10 subsequent siblings)
25 siblings, 0 replies; 28+ messages in thread
From: Sasha Levin @ 2018-12-05 9:35 UTC (permalink / raw)
To: stable, linux-kernel
Cc: Taehee Yoo, Pablo Neira Ayuso, Sasha Levin, netfilter-devel,
coreteam, netdev
From: Taehee Yoo <ap420073@gmail.com>
[ Upstream commit 095faf45e64be00bff4da2d6182dface3d69c9b7 ]
There is a reference counter to ensure that masquerade modules register
notifiers only once. However, the existing reference counter approach is
not safe, test commands are:
while :
do
modprobe ip6t_MASQUERADE &
modprobe nft_masq_ipv6 &
modprobe -rv ip6t_MASQUERADE &
modprobe -rv nft_masq_ipv6 &
done
numbers below represent the reference counter.
--------------------------------------------------------
CPU0 CPU1 CPU2 CPU3 CPU4
[insmod] [insmod] [rmmod] [rmmod] [insmod]
--------------------------------------------------------
0->1
register 1->2
returns 2->1
returns 1->0
0->1
register <--
unregister
--------------------------------------------------------
The unregistation of CPU3 should be processed before the
registration of CPU4.
In order to fix this, use a mutex instead of reference counter.
splat looks like:
[ 323.869557] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [modprobe:1381]
[ 323.869574] Modules linked in: nf_tables(+) nf_nat_ipv6(-) nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 n]
[ 323.869574] irq event stamp: 194074
[ 323.898930] hardirqs last enabled at (194073): [<ffffffff90004a0d>] trace_hardirqs_on_thunk+0x1a/0x1c
[ 323.898930] hardirqs last disabled at (194074): [<ffffffff90004a29>] trace_hardirqs_off_thunk+0x1a/0x1c
[ 323.898930] softirqs last enabled at (182132): [<ffffffff922006ec>] __do_softirq+0x6ec/0xa3b
[ 323.898930] softirqs last disabled at (182109): [<ffffffff90193426>] irq_exit+0x1a6/0x1e0
[ 323.898930] CPU: 0 PID: 1381 Comm: modprobe Not tainted 4.20.0-rc2+ #27
[ 323.898930] RIP: 0010:raw_notifier_chain_register+0xea/0x240
[ 323.898930] Code: 3c 03 0f 8e f2 00 00 00 44 3b 6b 10 7f 4d 49 bc 00 00 00 00 00 fc ff df eb 22 48 8d 7b 10 488
[ 323.898930] RSP: 0018:ffff888101597218 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
[ 323.898930] RAX: 0000000000000000 RBX: ffffffffc04361c0 RCX: 0000000000000000
[ 323.898930] RDX: 1ffffffff26132ae RSI: ffffffffc04aa3c0 RDI: ffffffffc04361d0
[ 323.898930] RBP: ffffffffc04361c8 R08: 0000000000000000 R09: 0000000000000001
[ 323.898930] R10: ffff8881015972b0 R11: fffffbfff26132c4 R12: dffffc0000000000
[ 323.898930] R13: 0000000000000000 R14: 1ffff110202b2e44 R15: ffffffffc04aa3c0
[ 323.898930] FS: 00007f813ed41540(0000) GS:ffff88811ae00000(0000) knlGS:0000000000000000
[ 323.898930] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 323.898930] CR2: 0000559bf2c9f120 CR3: 000000010bc80000 CR4: 00000000001006f0
[ 323.898930] Call Trace:
[ 323.898930] ? atomic_notifier_chain_register+0x2d0/0x2d0
[ 323.898930] ? down_read+0x150/0x150
[ 323.898930] ? sched_clock_cpu+0x126/0x170
[ 323.898930] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
[ 323.898930] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
[ 323.898930] register_netdevice_notifier+0xbb/0x790
[ 323.898930] ? __dev_close_many+0x2d0/0x2d0
[ 323.898930] ? __mutex_unlock_slowpath+0x17f/0x740
[ 323.898930] ? wait_for_completion+0x710/0x710
[ 323.898930] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
[ 323.898930] ? up_write+0x6c/0x210
[ 323.898930] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
[ 324.127073] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
[ 324.127073] nft_chain_filter_init+0x1e/0xe8a [nf_tables]
[ 324.127073] nf_tables_module_init+0x37/0x92 [nf_tables]
[ ... ]
Fixes: 8dd33cc93ec9 ("netfilter: nf_nat: generalize IPv4 masquerading support for nf_tables")
Fixes: be6b635cd674 ("netfilter: nf_nat: generalize IPv6 masquerading support for nf_tables")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/ipv4/netfilter/nf_nat_masquerade_ipv4.c | 23 ++++++++++++++-------
net/ipv6/netfilter/nf_nat_masquerade_ipv6.c | 23 ++++++++++++++-------
2 files changed, 32 insertions(+), 14 deletions(-)
diff --git a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
index 4a7c1f207d6e..4c7fcd32f8e6 100644
--- a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
+++ b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
@@ -131,15 +131,17 @@ static struct notifier_block masq_inet_notifier = {
.notifier_call = masq_inet_event,
};
-static atomic_t masquerade_notifier_refcount = ATOMIC_INIT(0);
+static int masq_refcnt;
+static DEFINE_MUTEX(masq_mutex);
int nf_nat_masquerade_ipv4_register_notifier(void)
{
- int ret;
+ int ret = 0;
+ mutex_lock(&masq_mutex);
/* check if the notifier was already set */
- if (atomic_inc_return(&masquerade_notifier_refcount) > 1)
- return 0;
+ if (++masq_refcnt > 1)
+ goto out_unlock;
/* Register for device down reports */
ret = register_netdevice_notifier(&masq_dev_notifier);
@@ -150,22 +152,29 @@ int nf_nat_masquerade_ipv4_register_notifier(void)
if (ret)
goto err_unregister;
+ mutex_unlock(&masq_mutex);
return ret;
+
err_unregister:
unregister_netdevice_notifier(&masq_dev_notifier);
err_dec:
- atomic_dec(&masquerade_notifier_refcount);
+ masq_refcnt--;
+out_unlock:
+ mutex_unlock(&masq_mutex);
return ret;
}
EXPORT_SYMBOL_GPL(nf_nat_masquerade_ipv4_register_notifier);
void nf_nat_masquerade_ipv4_unregister_notifier(void)
{
+ mutex_lock(&masq_mutex);
/* check if the notifier still has clients */
- if (atomic_dec_return(&masquerade_notifier_refcount) > 0)
- return;
+ if (--masq_refcnt > 0)
+ goto out_unlock;
unregister_netdevice_notifier(&masq_dev_notifier);
unregister_inetaddr_notifier(&masq_inet_notifier);
+out_unlock:
+ mutex_unlock(&masq_mutex);
}
EXPORT_SYMBOL_GPL(nf_nat_masquerade_ipv4_unregister_notifier);
diff --git a/net/ipv6/netfilter/nf_nat_masquerade_ipv6.c b/net/ipv6/netfilter/nf_nat_masquerade_ipv6.c
index 10012fc687b6..37b1d413c825 100644
--- a/net/ipv6/netfilter/nf_nat_masquerade_ipv6.c
+++ b/net/ipv6/netfilter/nf_nat_masquerade_ipv6.c
@@ -162,15 +162,17 @@ static struct notifier_block masq_inet6_notifier = {
.notifier_call = masq_inet6_event,
};
-static atomic_t masquerade_notifier_refcount = ATOMIC_INIT(0);
+static int masq_refcnt;
+static DEFINE_MUTEX(masq_mutex);
int nf_nat_masquerade_ipv6_register_notifier(void)
{
- int ret;
+ int ret = 0;
+ mutex_lock(&masq_mutex);
/* check if the notifier is already set */
- if (atomic_inc_return(&masquerade_notifier_refcount) > 1)
- return 0;
+ if (++masq_refcnt > 1)
+ goto out_unlock;
ret = register_netdevice_notifier(&masq_dev_notifier);
if (ret)
@@ -180,22 +182,29 @@ int nf_nat_masquerade_ipv6_register_notifier(void)
if (ret)
goto err_unregister;
+ mutex_unlock(&masq_mutex);
return ret;
+
err_unregister:
unregister_netdevice_notifier(&masq_dev_notifier);
err_dec:
- atomic_dec(&masquerade_notifier_refcount);
+ masq_refcnt--;
+out_unlock:
+ mutex_unlock(&masq_mutex);
return ret;
}
EXPORT_SYMBOL_GPL(nf_nat_masquerade_ipv6_register_notifier);
void nf_nat_masquerade_ipv6_unregister_notifier(void)
{
+ mutex_lock(&masq_mutex);
/* check if the notifier still has clients */
- if (atomic_dec_return(&masquerade_notifier_refcount) > 0)
- return;
+ if (--masq_refcnt > 0)
+ goto out_unlock;
unregister_inet6addr_notifier(&masq_inet6_notifier);
unregister_netdevice_notifier(&masq_dev_notifier);
+out_unlock:
+ mutex_unlock(&masq_mutex);
}
EXPORT_SYMBOL_GPL(nf_nat_masquerade_ipv6_unregister_notifier);
--
2.17.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH AUTOSEL 4.19 079/123] netfilter: nf_conncount: remove wrong condition check routine
[not found] <20181205093555.5386-1-sashal@kernel.org>
` (14 preceding siblings ...)
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 078/123] netfilter: nat: fix double register in masquerade modules Sasha Levin
@ 2018-12-05 9:35 ` Sasha Levin
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 083/123] usbnet: ipheth: fix potential recvmsg bug and recvmsg bug 2 Sasha Levin
` (9 subsequent siblings)
25 siblings, 0 replies; 28+ messages in thread
From: Sasha Levin @ 2018-12-05 9:35 UTC (permalink / raw)
To: stable, linux-kernel
Cc: Taehee Yoo, Pablo Neira Ayuso, Sasha Levin, netfilter-devel,
coreteam, netdev
From: Taehee Yoo <ap420073@gmail.com>
[ Upstream commit 53ca0f2fec39c80ccd19e6e3f30cc8daef174b70 ]
All lists that reach the tree_nodes_free() function have both zero
counter and true dead flag. The reason for this is that lists to be
release are selected by nf_conncount_gc_list() which already decrements
the list counter and sets on the dead flag. Therefore, this if statement
in tree_nodes_free() is unnecessary and wrong.
Fixes: 31568ec09ea0 ("netfilter: nf_conncount: fix list_del corruption in conn_free")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/netfilter/nf_conncount.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c
index 8acae4a3e4c0..b6d0f6deea86 100644
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -323,11 +323,8 @@ static void tree_nodes_free(struct rb_root *root,
while (gc_count) {
rbconn = gc_nodes[--gc_count];
spin_lock(&rbconn->list.list_lock);
- if (rbconn->list.count == 0 && rbconn->list.dead == false) {
- rbconn->list.dead = true;
- rb_erase(&rbconn->node, root);
- call_rcu(&rbconn->rcu_head, __tree_nodes_free);
- }
+ rb_erase(&rbconn->node, root);
+ call_rcu(&rbconn->rcu_head, __tree_nodes_free);
spin_unlock(&rbconn->list.list_lock);
}
}
--
2.17.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH AUTOSEL 4.19 083/123] usbnet: ipheth: fix potential recvmsg bug and recvmsg bug 2
[not found] <20181205093555.5386-1-sashal@kernel.org>
` (15 preceding siblings ...)
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 079/123] netfilter: nf_conncount: remove wrong condition check routine Sasha Levin
@ 2018-12-05 9:35 ` Sasha Levin
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 084/123] net: phy: add workaround for issue where PHY driver doesn't bind to the device Sasha Levin
` (8 subsequent siblings)
25 siblings, 0 replies; 28+ messages in thread
From: Sasha Levin @ 2018-12-05 9:35 UTC (permalink / raw)
To: stable, linux-kernel
Cc: Bernd Eckstein, Oliver Zweigle, Bernd Eckstein, David S . Miller,
Sasha Levin, linux-usb, netdev
From: Bernd Eckstein <3erndeckstein@gmail.com>
[ Upstream commit 45611c61dd503454b2edae00aabe1e429ec49ebe ]
The bug is not easily reproducable, as it may occur very infrequently
(we had machines with 20minutes heavy downloading before it occurred)
However, on a virual machine (VMWare on Windows 10 host) it occurred
pretty frequently (1-2 seconds after a speedtest was started)
dev->tx_skb mab be freed via dev_kfree_skb_irq on a callback
before it is set.
This causes the following problems:
- double free of the skb or potential memory leak
- in dmesg: 'recvmsg bug' and 'recvmsg bug 2' and eventually
general protection fault
Example dmesg output:
[ 134.841986] ------------[ cut here ]------------
[ 134.841987] recvmsg bug: copied 9C24A555 seq 9C24B557 rcvnxt 9C25A6B3 fl 0
[ 134.841993] WARNING: CPU: 7 PID: 2629 at /build/linux-hwe-On9fm7/linux-hwe-4.15.0/net/ipv4/tcp.c:1865 tcp_recvmsg+0x44d/0xab0
[ 134.841994] Modules linked in: ipheth(OE) kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd vmw_balloon intel_rapl_perf joydev input_leds serio_raw vmw_vsock_vmci_transport vsock shpchp i2c_piix4 mac_hid binfmt_misc vmw_vmci parport_pc ppdev lp parport autofs4 vmw_pvscsi vmxnet3 hid_generic usbhid hid vmwgfx ttm drm_kms_helper syscopyarea sysfillrect mptspi mptscsih sysimgblt ahci psmouse fb_sys_fops pata_acpi mptbase libahci e1000 drm scsi_transport_spi
[ 134.842046] CPU: 7 PID: 2629 Comm: python Tainted: G W OE 4.15.0-34-generic #37~16.04.1-Ubuntu
[ 134.842046] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
[ 134.842048] RIP: 0010:tcp_recvmsg+0x44d/0xab0
[ 134.842048] RSP: 0018:ffffa6630422bcc8 EFLAGS: 00010286
[ 134.842049] RAX: 0000000000000000 RBX: ffff997616f4f200 RCX: 0000000000000006
[ 134.842049] RDX: 0000000000000007 RSI: 0000000000000082 RDI: ffff9976257d6490
[ 134.842050] RBP: ffffa6630422bd98 R08: 0000000000000001 R09: 000000000004bba4
[ 134.842050] R10: 0000000001e00c6f R11: 000000000004bba4 R12: ffff99760dee3000
[ 134.842051] R13: 0000000000000000 R14: ffff99760dee3514 R15: 0000000000000000
[ 134.842051] FS: 00007fe332347700(0000) GS:ffff9976257c0000(0000) knlGS:0000000000000000
[ 134.842052] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 134.842053] CR2: 0000000001e41000 CR3: 000000020e9b4006 CR4: 00000000003606e0
[ 134.842055] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 134.842055] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 134.842057] Call Trace:
[ 134.842060] ? aa_sk_perm+0x53/0x1a0
[ 134.842064] inet_recvmsg+0x51/0xc0
[ 134.842066] sock_recvmsg+0x43/0x50
[ 134.842070] SYSC_recvfrom+0xe4/0x160
[ 134.842072] ? __schedule+0x3de/0x8b0
[ 134.842075] ? ktime_get_ts64+0x4c/0xf0
[ 134.842079] SyS_recvfrom+0xe/0x10
[ 134.842082] do_syscall_64+0x73/0x130
[ 134.842086] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 134.842086] RIP: 0033:0x7fe331f5a81d
[ 134.842088] RSP: 002b:00007ffe8da98398 EFLAGS: 00000246 ORIG_RAX: 000000000000002d
[ 134.842090] RAX: ffffffffffffffda RBX: ffffffffffffffff RCX: 00007fe331f5a81d
[ 134.842094] RDX: 00000000000003fb RSI: 0000000001e00874 RDI: 0000000000000003
[ 134.842095] RBP: 00007fe32f642c70 R08: 0000000000000000 R09: 0000000000000000
[ 134.842097] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fe332347698
[ 134.842099] R13: 0000000001b7e0a0 R14: 0000000001e00874 R15: 0000000000000000
[ 134.842103] Code: 24 fd ff ff e9 cc fe ff ff 48 89 d8 41 8b 8c 24 10 05 00 00 44 8b 45 80 48 c7 c7 08 bd 59 8b 48 89 85 68 ff ff ff e8 b3 c4 7d ff <0f> 0b 48 8b 85 68 ff ff ff e9 e9 fe ff ff 41 8b 8c 24 10 05 00
[ 134.842126] ---[ end trace b7138fc08c83147f ]---
[ 134.842144] general protection fault: 0000 [#1] SMP PTI
[ 134.842145] Modules linked in: ipheth(OE) kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd vmw_balloon intel_rapl_perf joydev input_leds serio_raw vmw_vsock_vmci_transport vsock shpchp i2c_piix4 mac_hid binfmt_misc vmw_vmci parport_pc ppdev lp parport autofs4 vmw_pvscsi vmxnet3 hid_generic usbhid hid vmwgfx ttm drm_kms_helper syscopyarea sysfillrect mptspi mptscsih sysimgblt ahci psmouse fb_sys_fops pata_acpi mptbase libahci e1000 drm scsi_transport_spi
[ 134.842161] CPU: 7 PID: 2629 Comm: python Tainted: G W OE 4.15.0-34-generic #37~16.04.1-Ubuntu
[ 134.842162] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
[ 134.842164] RIP: 0010:tcp_close+0x2c6/0x440
[ 134.842165] RSP: 0018:ffffa6630422bde8 EFLAGS: 00010202
[ 134.842167] RAX: 0000000000000000 RBX: ffff99760dee3000 RCX: 0000000180400034
[ 134.842168] RDX: 5c4afd407207a6c4 RSI: ffffe868495bd300 RDI: ffff997616f4f200
[ 134.842169] RBP: ffffa6630422be08 R08: 0000000016f4d401 R09: 0000000180400034
[ 134.842169] R10: ffffa6630422bd98 R11: 0000000000000000 R12: 000000000000600c
[ 134.842170] R13: 0000000000000000 R14: ffff99760dee30c8 R15: ffff9975bd44fe00
[ 134.842171] FS: 00007fe332347700(0000) GS:ffff9976257c0000(0000) knlGS:0000000000000000
[ 134.842173] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 134.842174] CR2: 0000000001e41000 CR3: 000000020e9b4006 CR4: 00000000003606e0
[ 134.842177] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 134.842178] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 134.842179] Call Trace:
[ 134.842181] inet_release+0x42/0x70
[ 134.842183] __sock_release+0x42/0xb0
[ 134.842184] sock_close+0x15/0x20
[ 134.842187] __fput+0xea/0x220
[ 134.842189] ____fput+0xe/0x10
[ 134.842191] task_work_run+0x8a/0xb0
[ 134.842193] exit_to_usermode_loop+0xc4/0xd0
[ 134.842195] do_syscall_64+0xf4/0x130
[ 134.842197] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 134.842197] RIP: 0033:0x7fe331f5a560
[ 134.842198] RSP: 002b:00007ffe8da982e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[ 134.842200] RAX: 0000000000000000 RBX: 00007fe32f642c70 RCX: 00007fe331f5a560
[ 134.842201] RDX: 00000000008f5320 RSI: 0000000001cd4b50 RDI: 0000000000000003
[ 134.842202] RBP: 00007fe32f6500f8 R08: 000000000000003c R09: 00000000009343c0
[ 134.842203] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fe32f6500d0
[ 134.842204] R13: 00000000008f5320 R14: 00000000008f5320 R15: 0000000001cd4770
[ 134.842205] Code: c8 00 00 00 45 31 e4 49 39 fe 75 4d eb 50 83 ab d8 00 00 00 01 48 8b 17 48 8b 47 08 48 c7 07 00 00 00 00 48 c7 47 08 00 00 00 00 <48> 89 42 08 48 89 10 0f b6 57 34 8b 47 2c 2b 47 28 83 e2 01 80
[ 134.842226] RIP: tcp_close+0x2c6/0x440 RSP: ffffa6630422bde8
[ 134.842227] ---[ end trace b7138fc08c831480 ]---
The proposed patch eliminates a potential racing condition.
Before, usb_submit_urb was called and _after_ that, the skb was attached
(dev->tx_skb). So, on a callback it was possible, however unlikely that the
skb was freed before it was set. That way (because dev->tx_skb was not set
to NULL after it was freed), it could happen that a skb from a earlier
transmission was freed a second time (and the skb we should have freed did
not get freed at all)
Now we free the skb directly in ipheth_tx(). It is not passed to the
callback anymore, eliminating the posibility of a double free of the same
skb. Depending on the retval of usb_submit_urb() we use dev_kfree_skb_any()
respectively dev_consume_skb_any() to free the skb.
Signed-off-by: Oliver Zweigle <Oliver.Zweigle@faro.com>
Signed-off-by: Bernd Eckstein <3ernd.Eckstein@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/usb/ipheth.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/drivers/net/usb/ipheth.c b/drivers/net/usb/ipheth.c
index 7275761a1177..3d8a70d3ea9b 100644
--- a/drivers/net/usb/ipheth.c
+++ b/drivers/net/usb/ipheth.c
@@ -140,7 +140,6 @@ struct ipheth_device {
struct usb_device *udev;
struct usb_interface *intf;
struct net_device *net;
- struct sk_buff *tx_skb;
struct urb *tx_urb;
struct urb *rx_urb;
unsigned char *tx_buf;
@@ -230,6 +229,7 @@ static void ipheth_rcvbulk_callback(struct urb *urb)
case -ENOENT:
case -ECONNRESET:
case -ESHUTDOWN:
+ case -EPROTO:
return;
case 0:
break;
@@ -281,7 +281,6 @@ static void ipheth_sndbulk_callback(struct urb *urb)
dev_err(&dev->intf->dev, "%s: urb status: %d\n",
__func__, status);
- dev_kfree_skb_irq(dev->tx_skb);
if (status == 0)
netif_wake_queue(dev->net);
else
@@ -423,7 +422,7 @@ static int ipheth_tx(struct sk_buff *skb, struct net_device *net)
if (skb->len > IPHETH_BUF_SIZE) {
WARN(1, "%s: skb too large: %d bytes\n", __func__, skb->len);
dev->net->stats.tx_dropped++;
- dev_kfree_skb_irq(skb);
+ dev_kfree_skb_any(skb);
return NETDEV_TX_OK;
}
@@ -443,12 +442,11 @@ static int ipheth_tx(struct sk_buff *skb, struct net_device *net)
dev_err(&dev->intf->dev, "%s: usb_submit_urb: %d\n",
__func__, retval);
dev->net->stats.tx_errors++;
- dev_kfree_skb_irq(skb);
+ dev_kfree_skb_any(skb);
} else {
- dev->tx_skb = skb;
-
dev->net->stats.tx_packets++;
dev->net->stats.tx_bytes += skb->len;
+ dev_consume_skb_any(skb);
netif_stop_queue(net);
}
--
2.17.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH AUTOSEL 4.19 084/123] net: phy: add workaround for issue where PHY driver doesn't bind to the device
[not found] <20181205093555.5386-1-sashal@kernel.org>
` (16 preceding siblings ...)
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 083/123] usbnet: ipheth: fix potential recvmsg bug and recvmsg bug 2 Sasha Levin
@ 2018-12-05 9:35 ` Sasha Levin
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 085/123] net: thunderx: fix NULL pointer dereference in nic_remove Sasha Levin
` (7 subsequent siblings)
25 siblings, 0 replies; 28+ messages in thread
From: Sasha Levin @ 2018-12-05 9:35 UTC (permalink / raw)
To: stable, linux-kernel
Cc: Heiner Kallweit, David S . Miller, Sasha Levin, netdev
From: Heiner Kallweit <hkallweit1@gmail.com>
[ Upstream commit c85ddecae6e5e82ca3ae6f20c63f1d865e2ff5ea ]
After switching the r8169 driver to use phylib some user reported that
their network is broken. This was caused by the genphy PHY driver being
used instead of the dedicated PHY driver for the RTL8211B. Users
reported that loading the Realtek PHY driver module upfront fixes the
issue. See also this mail thread:
https://marc.info/?t=154279781800003&r=1&w=2
The issue is quite weird and the root cause seems to be somewhere in
the base driver core. The patch works around the issue and may be
removed once the actual issue is fixed.
The Fixes tag refers to the first reported occurrence of the issue.
The issue itself may have been existing much longer and it may affect
users of other network chips as well. Users typically will recognize
this issue only if their PHY stops working when being used with the
genphy driver.
Fixes: f1e911d5d0df ("r8169: add basic phylib support")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/phy/phy_device.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 19ab8a7d1e48..733e35b7c4bb 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -1930,6 +1930,14 @@ int phy_driver_register(struct phy_driver *new_driver, struct module *owner)
new_driver->mdiodrv.driver.remove = phy_remove;
new_driver->mdiodrv.driver.owner = owner;
+ /* The following works around an issue where the PHY driver doesn't bind
+ * to the device, resulting in the genphy driver being used instead of
+ * the dedicated driver. The root cause of the issue isn't known yet
+ * and seems to be in the base driver core. Once this is fixed we may
+ * remove this workaround.
+ */
+ new_driver->mdiodrv.driver.probe_type = PROBE_FORCE_SYNCHRONOUS;
+
retval = driver_register(&new_driver->mdiodrv.driver);
if (retval) {
pr_err("%s: Error %d in registering driver\n",
--
2.17.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH AUTOSEL 4.19 085/123] net: thunderx: fix NULL pointer dereference in nic_remove
[not found] <20181205093555.5386-1-sashal@kernel.org>
` (17 preceding siblings ...)
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 084/123] net: phy: add workaround for issue where PHY driver doesn't bind to the device Sasha Levin
@ 2018-12-05 9:35 ` Sasha Levin
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 086/123] lan743x: fix return value for lan743x_tx_napi_poll Sasha Levin
` (6 subsequent siblings)
25 siblings, 0 replies; 28+ messages in thread
From: Sasha Levin @ 2018-12-05 9:35 UTC (permalink / raw)
To: stable, linux-kernel
Cc: Lorenzo Bianconi, David S . Miller, Sasha Levin, netdev
From: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
[ Upstream commit 24a6d2dd263bc910de018c78d1148b3e33b94512 ]
Fix a possible NULL pointer dereference in nic_remove routine
removing the nicpf module if nic_probe fails.
The issue can be triggered with the following reproducer:
$rmmod nicvf
$rmmod nicpf
[ 521.412008] Unable to handle kernel access to user memory outside uaccess routines at virtual address 0000000000000014
[ 521.422777] Mem abort info:
[ 521.425561] ESR = 0x96000004
[ 521.428624] Exception class = DABT (current EL), IL = 32 bits
[ 521.434535] SET = 0, FnV = 0
[ 521.437579] EA = 0, S1PTW = 0
[ 521.440730] Data abort info:
[ 521.443603] ISV = 0, ISS = 0x00000004
[ 521.447431] CM = 0, WnR = 0
[ 521.450417] user pgtable: 4k pages, 48-bit VAs, pgdp = 0000000072a3da42
[ 521.457022] [0000000000000014] pgd=0000000000000000
[ 521.461916] Internal error: Oops: 96000004 [#1] SMP
[ 521.511801] Hardware name: GIGABYTE H270-T70/MT70-HD0, BIOS T49 02/02/2018
[ 521.518664] pstate: 80400005 (Nzcv daif +PAN -UAO)
[ 521.523451] pc : nic_remove+0x24/0x88 [nicpf]
[ 521.527808] lr : pci_device_remove+0x48/0xd8
[ 521.532066] sp : ffff000013433cc0
[ 521.535370] x29: ffff000013433cc0 x28: ffff810f6ac50000
[ 521.540672] x27: 0000000000000000 x26: 0000000000000000
[ 521.545974] x25: 0000000056000000 x24: 0000000000000015
[ 521.551274] x23: ffff8007ff89a110 x22: ffff000001667070
[ 521.556576] x21: ffff8007ffb170b0 x20: ffff8007ffb17000
[ 521.561877] x19: 0000000000000000 x18: 0000000000000025
[ 521.567178] x17: 0000000000000000 x16: 000000000000010ffc33ff98 x8 : 0000000000000000
[ 521.593683] x7 : 0000000000000000 x6 : 0000000000000001
[ 521.598983] x5 : 0000000000000002 x4 : 0000000000000003
[ 521.604284] x3 : ffff8007ffb17184 x2 : ffff8007ffb17184
[ 521.609585] x1 : ffff000001662118 x0 : ffff000008557be0
[ 521.614887] Process rmmod (pid: 1897, stack limit = 0x00000000859535c3)
[ 521.621490] Call trace:
[ 521.623928] nic_remove+0x24/0x88 [nicpf]
[ 521.627927] pci_device_remove+0x48/0xd8
[ 521.631847] device_release_driver_internal+0x1b0/0x248
[ 521.637062] driver_detach+0x50/0xc0
[ 521.640628] bus_remove_driver+0x60/0x100
[ 521.644627] driver_unregister+0x34/0x60
[ 521.648538] pci_unregister_driver+0x24/0xd8
[ 521.652798] nic_cleanup_module+0x14/0x111c [nicpf]
[ 521.657672] __arm64_sys_delete_module+0x150/0x218
[ 521.662460] el0_svc_handler+0x94/0x110
[ 521.666287] el0_svc+0x8/0xc
[ 521.669160] Code: aa1e03e0 9102c295 d503201f f9404eb3 (b9401660)
Fixes: 4863dea3fab0 ("net: Adding support for Cavium ThunderX network controller")
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/ethernet/cavium/thunder/nic_main.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c
index 55af04fa03a7..6c8dcb65ff03 100644
--- a/drivers/net/ethernet/cavium/thunder/nic_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nic_main.c
@@ -1441,6 +1441,9 @@ static void nic_remove(struct pci_dev *pdev)
{
struct nicpf *nic = pci_get_drvdata(pdev);
+ if (!nic)
+ return;
+
if (nic->flags & NIC_SRIOV_ENABLED)
pci_disable_sriov(pdev);
--
2.17.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH AUTOSEL 4.19 086/123] lan743x: fix return value for lan743x_tx_napi_poll
[not found] <20181205093555.5386-1-sashal@kernel.org>
` (18 preceding siblings ...)
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 085/123] net: thunderx: fix NULL pointer dereference in nic_remove Sasha Levin
@ 2018-12-05 9:35 ` Sasha Levin
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 087/123] lan743x: Enable driver to work with LAN7431 Sasha Levin
` (5 subsequent siblings)
25 siblings, 0 replies; 28+ messages in thread
From: Sasha Levin @ 2018-12-05 9:35 UTC (permalink / raw)
To: stable, linux-kernel
Cc: Bryan Whitehead, David S . Miller, Sasha Levin, netdev
From: Bryan Whitehead <Bryan.Whitehead@microchip.com>
[ Upstream commit cc5922054131f9abefdc0622ae64fc55e6b2671d ]
The lan743x driver, when under heavy traffic load, has been noticed
to sometimes hang, or cause a kernel panic.
Debugging reveals that the TX napi poll routine was returning
the wrong value, 'weight'. Most other drivers return 0.
And call napi_complete, instead of napi_complete_done.
Additionally when creating the tx napi poll routine.
Changed netif_napi_add, to netif_tx_napi_add.
Updates for v3:
changed 'fixes' tag to match defined format
Updates for v2:
use napi_complete, instead of napi_complete_done in
lan743x_tx_napi_poll
use netif_tx_napi_add, instead of netif_napi_add for
registration of tx napi poll routine
fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver")
Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/ethernet/microchip/lan743x_main.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/microchip/lan743x_main.c b/drivers/net/ethernet/microchip/lan743x_main.c
index 001b5f714c1b..5c3ac70abeca 100644
--- a/drivers/net/ethernet/microchip/lan743x_main.c
+++ b/drivers/net/ethernet/microchip/lan743x_main.c
@@ -1675,7 +1675,7 @@ static int lan743x_tx_napi_poll(struct napi_struct *napi, int weight)
netif_wake_queue(adapter->netdev);
}
- if (!napi_complete_done(napi, weight))
+ if (!napi_complete(napi))
goto done;
/* enable isr */
@@ -1684,7 +1684,7 @@ static int lan743x_tx_napi_poll(struct napi_struct *napi, int weight)
lan743x_csr_read(adapter, INT_STS);
done:
- return weight;
+ return 0;
}
static void lan743x_tx_ring_cleanup(struct lan743x_tx *tx)
@@ -1873,9 +1873,9 @@ static int lan743x_tx_open(struct lan743x_tx *tx)
tx->vector_flags = lan743x_intr_get_vector_flags(adapter,
INT_BIT_DMA_TX_
(tx->channel_number));
- netif_napi_add(adapter->netdev,
- &tx->napi, lan743x_tx_napi_poll,
- tx->ring_size - 1);
+ netif_tx_napi_add(adapter->netdev,
+ &tx->napi, lan743x_tx_napi_poll,
+ tx->ring_size - 1);
napi_enable(&tx->napi);
data = 0;
--
2.17.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH AUTOSEL 4.19 087/123] lan743x: Enable driver to work with LAN7431
[not found] <20181205093555.5386-1-sashal@kernel.org>
` (19 preceding siblings ...)
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 086/123] lan743x: fix return value for lan743x_tx_napi_poll Sasha Levin
@ 2018-12-05 9:35 ` Sasha Levin
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 089/123] netfilter: nf_tables: deactivate expressions in rule replecement routine Sasha Levin
` (4 subsequent siblings)
25 siblings, 0 replies; 28+ messages in thread
From: Sasha Levin @ 2018-12-05 9:35 UTC (permalink / raw)
To: stable, linux-kernel
Cc: Bryan Whitehead, David S . Miller, Sasha Levin, netdev
From: Bryan Whitehead <Bryan.Whitehead@microchip.com>
[ Upstream commit 4df5ce9bc03e47d05f400e64aa32a82ec4cef419 ]
This driver was designed to work with both LAN7430 and LAN7431.
The only difference between the two is the LAN7431 has support
for external phy.
This change adds LAN7431 to the list of recognized devices
supported by this driver.
Updates for v2:
changed 'fixes' tag to match defined format
fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver")
Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/ethernet/microchip/lan743x_main.c | 1 +
drivers/net/ethernet/microchip/lan743x_main.h | 1 +
2 files changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/microchip/lan743x_main.c b/drivers/net/ethernet/microchip/lan743x_main.c
index 5c3ac70abeca..aaedf1072460 100644
--- a/drivers/net/ethernet/microchip/lan743x_main.c
+++ b/drivers/net/ethernet/microchip/lan743x_main.c
@@ -3020,6 +3020,7 @@ static const struct dev_pm_ops lan743x_pm_ops = {
static const struct pci_device_id lan743x_pcidev_tbl[] = {
{ PCI_DEVICE(PCI_VENDOR_ID_SMSC, PCI_DEVICE_ID_SMSC_LAN7430) },
+ { PCI_DEVICE(PCI_VENDOR_ID_SMSC, PCI_DEVICE_ID_SMSC_LAN7431) },
{ 0, }
};
diff --git a/drivers/net/ethernet/microchip/lan743x_main.h b/drivers/net/ethernet/microchip/lan743x_main.h
index 0e82b6368798..2d6eea18973e 100644
--- a/drivers/net/ethernet/microchip/lan743x_main.h
+++ b/drivers/net/ethernet/microchip/lan743x_main.h
@@ -548,6 +548,7 @@ struct lan743x_adapter;
/* SMSC acquired EFAR late 1990's, MCHP acquired SMSC 2012 */
#define PCI_VENDOR_ID_SMSC PCI_VENDOR_ID_EFAR
#define PCI_DEVICE_ID_SMSC_LAN7430 (0x7430)
+#define PCI_DEVICE_ID_SMSC_LAN7431 (0x7431)
#define PCI_CONFIG_LENGTH (0x1000)
--
2.17.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH AUTOSEL 4.19 089/123] netfilter: nf_tables: deactivate expressions in rule replecement routine
[not found] <20181205093555.5386-1-sashal@kernel.org>
` (20 preceding siblings ...)
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 087/123] lan743x: Enable driver to work with LAN7431 Sasha Levin
@ 2018-12-05 9:35 ` Sasha Levin
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 094/123] igb: fix uninitialized variables Sasha Levin
` (3 subsequent siblings)
25 siblings, 0 replies; 28+ messages in thread
From: Sasha Levin @ 2018-12-05 9:35 UTC (permalink / raw)
To: stable, linux-kernel
Cc: Taehee Yoo, Pablo Neira Ayuso, Sasha Levin, netfilter-devel,
coreteam, netdev
From: Taehee Yoo <ap420073@gmail.com>
[ Upstream commit ca08987885a147643817d02bf260bc4756ce8cd4 ]
There is no expression deactivation call from the rule replacement path,
hence, chain counter is not decremented. A few steps to reproduce the
problem:
%nft add table ip filter
%nft add chain ip filter c1
%nft add chain ip filter c1
%nft add rule ip filter c1 jump c2
%nft replace rule ip filter c1 handle 3 accept
%nft flush ruleset
<jump c2> expression means immediate NFT_JUMP to chain c2.
Reference count of chain c2 is increased when the rule is added.
When rule is deleted or replaced, the reference counter of c2 should be
decreased via nft_rule_expr_deactivate() which calls
nft_immediate_deactivate().
Splat looks like:
[ 214.396453] WARNING: CPU: 1 PID: 21 at net/netfilter/nf_tables_api.c:1432 nf_tables_chain_destroy.isra.38+0x2f9/0x3a0 [nf_tables]
[ 214.398983] Modules linked in: nf_tables nfnetlink
[ 214.398983] CPU: 1 PID: 21 Comm: kworker/1:1 Not tainted 4.20.0-rc2+ #44
[ 214.398983] Workqueue: events nf_tables_trans_destroy_work [nf_tables]
[ 214.398983] RIP: 0010:nf_tables_chain_destroy.isra.38+0x2f9/0x3a0 [nf_tables]
[ 214.398983] Code: 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 8e 00 00 00 48 8b 7b 58 e8 e1 2c 4e c6 48 89 df e8 d9 2c 4e c6 eb 9a <0f> 0b eb 96 0f 0b e9 7e fe ff ff e8 a7 7e 4e c6 e9 a4 fe ff ff e8
[ 214.398983] RSP: 0018:ffff8881152874e8 EFLAGS: 00010202
[ 214.398983] RAX: 0000000000000001 RBX: ffff88810ef9fc28 RCX: ffff8881152876f0
[ 214.398983] RDX: dffffc0000000000 RSI: 1ffff11022a50ede RDI: ffff88810ef9fc78
[ 214.398983] RBP: 1ffff11022a50e9d R08: 0000000080000000 R09: 0000000000000000
[ 214.398983] R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff11022a50eba
[ 214.398983] R13: ffff888114446e08 R14: ffff8881152876f0 R15: ffffed1022a50ed6
[ 214.398983] FS: 0000000000000000(0000) GS:ffff888116400000(0000) knlGS:0000000000000000
[ 214.398983] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 214.398983] CR2: 00007fab9bb5f868 CR3: 000000012aa16000 CR4: 00000000001006e0
[ 214.398983] Call Trace:
[ 214.398983] ? nf_tables_table_destroy.isra.37+0x100/0x100 [nf_tables]
[ 214.398983] ? __kasan_slab_free+0x145/0x180
[ 214.398983] ? nf_tables_trans_destroy_work+0x439/0x830 [nf_tables]
[ 214.398983] ? kfree+0xdb/0x280
[ 214.398983] nf_tables_trans_destroy_work+0x5f5/0x830 [nf_tables]
[ ... ]
Fixes: bb7b40aecbf7 ("netfilter: nf_tables: bogus EBUSY in chain deletions")
Reported by: Christoph Anton Mitterer <calestyo@scientia.net>
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=914505
Link: https://bugzilla.kernel.org/show_bug.cgi?id=201791
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/netfilter/nf_tables_api.c | 15 ++++-----------
1 file changed, 4 insertions(+), 11 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 06ed55cef962..fe0558b15fd3 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -2646,21 +2646,14 @@ static int nf_tables_newrule(struct net *net, struct sock *nlsk,
}
if (nlh->nlmsg_flags & NLM_F_REPLACE) {
- if (!nft_is_active_next(net, old_rule)) {
- err = -ENOENT;
- goto err2;
- }
- trans = nft_trans_rule_add(&ctx, NFT_MSG_DELRULE,
- old_rule);
+ trans = nft_trans_rule_add(&ctx, NFT_MSG_NEWRULE, rule);
if (trans == NULL) {
err = -ENOMEM;
goto err2;
}
- nft_deactivate_next(net, old_rule);
- chain->use--;
-
- if (nft_trans_rule_add(&ctx, NFT_MSG_NEWRULE, rule) == NULL) {
- err = -ENOMEM;
+ err = nft_delrule(&ctx, old_rule);
+ if (err < 0) {
+ nft_trans_destroy(trans);
goto err2;
}
--
2.17.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH AUTOSEL 4.19 094/123] igb: fix uninitialized variables
[not found] <20181205093555.5386-1-sashal@kernel.org>
` (21 preceding siblings ...)
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 089/123] netfilter: nf_tables: deactivate expressions in rule replecement routine Sasha Levin
@ 2018-12-05 9:35 ` Sasha Levin
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 095/123] ixgbe: recognize 1000BaseLX SFP modules as 1Gbps Sasha Levin
` (2 subsequent siblings)
25 siblings, 0 replies; 28+ messages in thread
From: Sasha Levin @ 2018-12-05 9:35 UTC (permalink / raw)
To: stable, linux-kernel; +Cc: Yunjian Wang, Jeff Kirsher, Sasha Levin, netdev
From: Yunjian Wang <wangyunjian@huawei.com>
[ Upstream commit e4c39f7926b4de355f7df75651d75003806aae09 ]
This patch fixes the variable 'phy_word' may be used uninitialized.
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/ethernet/intel/igb/e1000_i210.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/intel/igb/e1000_i210.c b/drivers/net/ethernet/intel/igb/e1000_i210.c
index c54ebedca6da..c393cb2c0f16 100644
--- a/drivers/net/ethernet/intel/igb/e1000_i210.c
+++ b/drivers/net/ethernet/intel/igb/e1000_i210.c
@@ -842,6 +842,7 @@ s32 igb_pll_workaround_i210(struct e1000_hw *hw)
nvm_word = E1000_INVM_DEFAULT_AL;
tmp_nvm = nvm_word | E1000_INVM_PLL_WO_VAL;
igb_write_phy_reg_82580(hw, I347AT4_PAGE_SELECT, E1000_PHY_PLL_FREQ_PAGE);
+ phy_word = E1000_PHY_PLL_UNCONF;
for (i = 0; i < E1000_MAX_PLL_TRIES; i++) {
/* check current state directly from internal PHY */
igb_read_phy_reg_82580(hw, E1000_PHY_PLL_FREQ_REG, &phy_word);
--
2.17.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH AUTOSEL 4.19 095/123] ixgbe: recognize 1000BaseLX SFP modules as 1Gbps
[not found] <20181205093555.5386-1-sashal@kernel.org>
` (22 preceding siblings ...)
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 094/123] igb: fix uninitialized variables Sasha Levin
@ 2018-12-05 9:35 ` Sasha Levin
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 096/123] rapidio/rionet: do not free skb before reading its length Sasha Levin
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 097/123] net: hisilicon: remove unexpected free_netdev Sasha Levin
25 siblings, 0 replies; 28+ messages in thread
From: Sasha Levin @ 2018-12-05 9:35 UTC (permalink / raw)
To: stable, linux-kernel; +Cc: Josh Elsasser, Jeff Kirsher, Sasha Levin, netdev
From: Josh Elsasser <jelsasser@appneta.com>
[ Upstream commit a8bf879af7b1999eba36303ce9cc60e0e7dd816c ]
Add the two 1000BaseLX enum values to the X550's check for 1Gbps modules,
allowing the core driver code to establish a link over this SFP type.
This is done by the out-of-tree driver but the fix wasn't in mainline.
Fixes: e23f33367882 ("ixgbe: Fix 1G and 10G link stability for X550EM_x SFP+”)
Fixes: 6a14ee0cfb19 ("ixgbe: Add X550 support function pointers")
Signed-off-by: Josh Elsasser <jelsasser@appneta.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c
index a8148c7126e5..9772016222c3 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c
@@ -2248,7 +2248,9 @@ static s32 ixgbe_get_link_capabilities_X550em(struct ixgbe_hw *hw,
*autoneg = false;
if (hw->phy.sfp_type == ixgbe_sfp_type_1g_sx_core0 ||
- hw->phy.sfp_type == ixgbe_sfp_type_1g_sx_core1) {
+ hw->phy.sfp_type == ixgbe_sfp_type_1g_sx_core1 ||
+ hw->phy.sfp_type == ixgbe_sfp_type_1g_lx_core0 ||
+ hw->phy.sfp_type == ixgbe_sfp_type_1g_lx_core1) {
*speed = IXGBE_LINK_SPEED_1GB_FULL;
return 0;
}
--
2.17.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH AUTOSEL 4.19 096/123] rapidio/rionet: do not free skb before reading its length
[not found] <20181205093555.5386-1-sashal@kernel.org>
` (23 preceding siblings ...)
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 095/123] ixgbe: recognize 1000BaseLX SFP modules as 1Gbps Sasha Levin
@ 2018-12-05 9:35 ` Sasha Levin
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 097/123] net: hisilicon: remove unexpected free_netdev Sasha Levin
25 siblings, 0 replies; 28+ messages in thread
From: Sasha Levin @ 2018-12-05 9:35 UTC (permalink / raw)
To: stable, linux-kernel; +Cc: Pan Bian, David S . Miller, Sasha Levin, netdev
From: Pan Bian <bianpan2016@163.com>
[ Upstream commit cfc435198f53a6fa1f656d98466b24967ff457d0 ]
skb is freed via dev_kfree_skb_any, however, skb->len is read then. This
may result in a use-after-free bug.
Fixes: e6161d64263 ("rapidio/rionet: rework driver initialization and removal")
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/rionet.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/rionet.c b/drivers/net/rionet.c
index e9f101c9bae2..bfbb39f93554 100644
--- a/drivers/net/rionet.c
+++ b/drivers/net/rionet.c
@@ -216,9 +216,9 @@ static int rionet_start_xmit(struct sk_buff *skb, struct net_device *ndev)
* it just report sending a packet to the target
* (without actual packet transfer).
*/
- dev_kfree_skb_any(skb);
ndev->stats.tx_packets++;
ndev->stats.tx_bytes += skb->len;
+ dev_kfree_skb_any(skb);
}
}
--
2.17.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH AUTOSEL 4.19 097/123] net: hisilicon: remove unexpected free_netdev
[not found] <20181205093555.5386-1-sashal@kernel.org>
` (24 preceding siblings ...)
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 096/123] rapidio/rionet: do not free skb before reading its length Sasha Levin
@ 2018-12-05 9:35 ` Sasha Levin
25 siblings, 0 replies; 28+ messages in thread
From: Sasha Levin @ 2018-12-05 9:35 UTC (permalink / raw)
To: stable, linux-kernel; +Cc: Pan Bian, David S . Miller, Sasha Levin, netdev
From: Pan Bian <bianpan2016@163.com>
[ Upstream commit c758940158bf29fe14e9d0f89d5848f227b48134 ]
The net device ndev is freed via free_netdev when failing to register
the device. The control flow then jumps to the error handling code
block. ndev is used and freed again. Resulting in a use-after-free bug.
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/ethernet/hisilicon/hip04_eth.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hip04_eth.c b/drivers/net/ethernet/hisilicon/hip04_eth.c
index 14374a856d30..6127697ede12 100644
--- a/drivers/net/ethernet/hisilicon/hip04_eth.c
+++ b/drivers/net/ethernet/hisilicon/hip04_eth.c
@@ -914,10 +914,8 @@ static int hip04_mac_probe(struct platform_device *pdev)
}
ret = register_netdev(ndev);
- if (ret) {
- free_netdev(ndev);
+ if (ret)
goto alloc_fail;
- }
return 0;
--
2.17.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH AUTOSEL 4.19 040/123] bpf: allocate local storage buffers using GFP_ATOMIC
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 040/123] bpf: allocate local storage buffers using GFP_ATOMIC Sasha Levin
@ 2018-12-07 6:40 ` Naresh Kamboju
2018-12-07 6:55 ` Naresh Kamboju
0 siblings, 1 reply; 28+ messages in thread
From: Naresh Kamboju @ 2018-12-07 6:40 UTC (permalink / raw)
To: Daniel Borkmann, Roman Gushchin
Cc: linux- stable, open list, guroan, ast, sashal, netdev
On Wed, 5 Dec 2018 at 15:08, Sasha Levin <sashal@kernel.org> wrote:
>
> From: Roman Gushchin <guroan@gmail.com>
>
> [ Upstream commit 569a933b03f3c48b392fe67c0086b3a6b9306b5a ]
>
> Naresh reported an issue with the non-atomic memory allocation of
> cgroup local storage buffers:
>
> [ 73.047526] BUG: sleeping function called from invalid context at
> /srv/oe/build/tmp-rpb-glibc/work-shared/intel-corei7-64/kernel-source/mm/slab.h:421
> [ 73.060915] in_atomic(): 1, irqs_disabled(): 0, pid: 3157, name: test_cgroup_sto
> [ 73.068342] INFO: lockdep is turned off.
> [ 73.072293] CPU: 2 PID: 3157 Comm: test_cgroup_sto Not tainted
> 4.20.0-rc2-next-20181113 #1
> [ 73.080548] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> 2.0b 07/27/2017
> [ 73.088018] Call Trace:
> [ 73.090463] dump_stack+0x70/0xa5
> [ 73.093783] ___might_sleep+0x152/0x240
> [ 73.097619] __might_sleep+0x4a/0x80
> [ 73.101191] __kmalloc_node+0x1cf/0x2f0
> [ 73.105031] ? cgroup_storage_update_elem+0x46/0x90
> [ 73.109909] cgroup_storage_update_elem+0x46/0x90
>
> cgroup_storage_update_elem() (as well as other update map update
> callbacks) is called with disabled preemption, so GFP_ATOMIC
> allocation should be used: e.g. alloc_htab_elem() in hashtab.c.
>
> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
> Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
> Signed-off-by: Roman Gushchin <guro@fb.com>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
I have reported above issue on 4.20.0-rc2-next-20181113.
Now this BUG re-occurring on 4.19.8-rc1 on x86_64 and arm64 devices.
[ 70.288592] BUG: sleeping function called from invalid context at
/srv/oe/build/tmp-rpb-glibc/work-shared/intel-corei7-64/kernel-source/mm/slab.h:421
[ 70.301992] in_atomic(): 1, irqs_disabled(): 0, pid: 3001, name:
test_cgroup_sto
[ 70.309424] INFO: lockdep is turned off.
[ 70.313416] CPU: 0 PID: 3001 Comm: test_cgroup_sto Not tainted 4.19.8-rc1 #1
[ 70.320483] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
2.0b 07/27/2017
[ 70.327953] Call Trace:
[ 70.330402] dump_stack+0x70/0xa5
[ 70.333765] ___might_sleep+0x152/0x240
[ 70.337599] __might_sleep+0x4a/0x80
[ 70.341169] __kmalloc_node+0x1d1/0x300
[ 70.345003] ? cgroup_storage_update_elem+0x46/0x90
[ 70.349881] cgroup_storage_update_elem+0x46/0x90
[ 70.354585] map_update_elem+0x1fd/0x450
[ 70.358504] __x64_sys_bpf+0x129/0x270
[ 70.362258] do_syscall_64+0x55/0x190
[ 70.365923] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 70.370974] RIP: 0033:0x7f42e0ebb969
[ 70.374544] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00
00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24
08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d df e4 2b 00 f7 d8 64 89
01 48
[ 70.393281] RSP: 002b:00007ffde61a0a08 EFLAGS: 00000202 ORIG_RAX:
0000000000000141
[ 70.400845] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f42e0ebb969
[ 70.407971] RDX: 0000000000000048 RSI: 00007ffde61a0a50 RDI: 0000000000000002
[ 70.415094] RBP: 00007ffde61a0a20 R08: 00007ffde61a0a50 R09: 00007ffde61a0a50
[ 70.422216] R10: 00007ffde61a0a50 R11: 0000000000000202 R12: 0000000000000005
[ 70.429342] R13: 00007ffde61a0c10 R14: 0000000000000000 R15: 0000000000000000
selftests: bpf: test_cgroup_storage
Full test log links,
arm64 Juno
https://lkft.validation.linaro.org/scheduler/job/537820#L2971
x86_64 Supermicro SYS-5019S-ML/X11SSH-F
https://lkft.validation.linaro.org/scheduler/job/537772#L2724
Best regards
Naresh Kamboju
> ---
> kernel/bpf/local_storage.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/bpf/local_storage.c b/kernel/bpf/local_storage.c
> index 830d7f095748..fc1605aee5ea 100644
> --- a/kernel/bpf/local_storage.c
> +++ b/kernel/bpf/local_storage.c
> @@ -138,7 +138,8 @@ static int cgroup_storage_update_elem(struct bpf_map *map, void *_key,
> return -ENOENT;
>
> new = kmalloc_node(sizeof(struct bpf_storage_buffer) +
> - map->value_size, __GFP_ZERO | GFP_USER,
> + map->value_size,
> + __GFP_ZERO | GFP_ATOMIC | __GFP_NOWARN,
> map->numa_node);
> if (!new)
> return -ENOMEM;
> --
> 2.17.1
>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH AUTOSEL 4.19 040/123] bpf: allocate local storage buffers using GFP_ATOMIC
2018-12-07 6:40 ` Naresh Kamboju
@ 2018-12-07 6:55 ` Naresh Kamboju
0 siblings, 0 replies; 28+ messages in thread
From: Naresh Kamboju @ 2018-12-07 6:55 UTC (permalink / raw)
To: Daniel Borkmann, Roman Gushchin
Cc: linux- stable, open list, guroan, ast, sashal, netdev
On Fri, 7 Dec 2018 at 12:10, Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
>
> On Wed, 5 Dec 2018 at 15:08, Sasha Levin <sashal@kernel.org> wrote:
> >
> > From: Roman Gushchin <guroan@gmail.com>
> >
> > [ Upstream commit 569a933b03f3c48b392fe67c0086b3a6b9306b5a ]
> >
> > Naresh reported an issue with the non-atomic memory allocation of
> > cgroup local storage buffers:
> >
> > [ 73.047526] BUG: sleeping function called from invalid context at
> > /srv/oe/build/tmp-rpb-glibc/work-shared/intel-corei7-64/kernel-source/mm/slab.h:421
> > [ 73.060915] in_atomic(): 1, irqs_disabled(): 0, pid: 3157, name: test_cgroup_sto
> > [ 73.068342] INFO: lockdep is turned off.
> > [ 73.072293] CPU: 2 PID: 3157 Comm: test_cgroup_sto Not tainted
> > 4.20.0-rc2-next-20181113 #1
> > [ 73.080548] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> > 2.0b 07/27/2017
> > [ 73.088018] Call Trace:
> > [ 73.090463] dump_stack+0x70/0xa5
> > [ 73.093783] ___might_sleep+0x152/0x240
> > [ 73.097619] __might_sleep+0x4a/0x80
> > [ 73.101191] __kmalloc_node+0x1cf/0x2f0
> > [ 73.105031] ? cgroup_storage_update_elem+0x46/0x90
> > [ 73.109909] cgroup_storage_update_elem+0x46/0x90
> >
> > cgroup_storage_update_elem() (as well as other update map update
> > callbacks) is called with disabled preemption, so GFP_ATOMIC
> > allocation should be used: e.g. alloc_htab_elem() in hashtab.c.
> >
> > Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
> > Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
> > Signed-off-by: Roman Gushchin <guro@fb.com>
> > Cc: Alexei Starovoitov <ast@kernel.org>
> > Cc: Daniel Borkmann <daniel@iogearbox.net>
> > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > Signed-off-by: Sasha Levin <sashal@kernel.org>
>
> I have reported above issue on 4.20.0-rc2-next-20181113.
> Now this BUG re-occurring on 4.19.8-rc1 on x86_64 and arm64 devices.
This BUG: was seen on 4.19.1-rc1 also on x86_64 and arm64 devices.
- Naresh
^ permalink raw reply [flat|nested] 28+ messages in thread
end of thread, other threads:[~2018-12-07 6:55 UTC | newest]
Thread overview: 28+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20181205093555.5386-1-sashal@kernel.org>
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 013/123] tools: bpftool: prevent infinite loop in get_fdinfo() Sasha Levin
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 017/123] netfilter: nf_conncount: use spin_lock_bh instead of spin_lock Sasha Levin
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 018/123] netfilter: nf_conncount: fix list_del corruption in conn_free Sasha Levin
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 019/123] netfilter: nf_conncount: fix unexpected permanent node of list Sasha Levin
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 020/123] netfilter: nf_tables: don't skip inactive chains during update Sasha Levin
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 023/123] netfilter: xt_RATEEST: remove netns exit routine Sasha Levin
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 024/123] netfilter: nf_tables: fix use-after-free when deleting compat expressions Sasha Levin
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 040/123] bpf: allocate local storage buffers using GFP_ATOMIC Sasha Levin
2018-12-07 6:40 ` Naresh Kamboju
2018-12-07 6:55 ` Naresh Kamboju
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 042/123] netfilter: xt_hashlimit: fix a possible memory leak in htable_create() Sasha Levin
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 058/123] tools: bpftool: fix potential NULL pointer dereference in do_load Sasha Levin
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 065/123] bpf: fix check of allowed specifiers in bpf_trace_printk Sasha Levin
2018-12-05 9:34 ` [PATCH AUTOSEL 4.19 067/123] ipvs: call ip_vs_dst_notifier earlier than ipv6_dev_notf Sasha Levin
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 075/123] netfilter: ipv6: Preserve link scope traffic original oif Sasha Levin
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 077/123] netfilter: add missing error handling code for register functions Sasha Levin
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 078/123] netfilter: nat: fix double register in masquerade modules Sasha Levin
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 079/123] netfilter: nf_conncount: remove wrong condition check routine Sasha Levin
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 083/123] usbnet: ipheth: fix potential recvmsg bug and recvmsg bug 2 Sasha Levin
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 084/123] net: phy: add workaround for issue where PHY driver doesn't bind to the device Sasha Levin
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 085/123] net: thunderx: fix NULL pointer dereference in nic_remove Sasha Levin
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 086/123] lan743x: fix return value for lan743x_tx_napi_poll Sasha Levin
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 087/123] lan743x: Enable driver to work with LAN7431 Sasha Levin
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 089/123] netfilter: nf_tables: deactivate expressions in rule replecement routine Sasha Levin
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 094/123] igb: fix uninitialized variables Sasha Levin
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 095/123] ixgbe: recognize 1000BaseLX SFP modules as 1Gbps Sasha Levin
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 096/123] rapidio/rionet: do not free skb before reading its length Sasha Levin
2018-12-05 9:35 ` [PATCH AUTOSEL 4.19 097/123] net: hisilicon: remove unexpected free_netdev Sasha Levin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).