From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Eric Dumazet <edumazet@google.com>,
Anoob Soman <anoob.soman@citrix.com>,
"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 4.9 07/32] packet: Do not call fanout_release from atomic contexts
Date: Fri, 24 Feb 2017 09:37:51 +0100
Message-ID: <20170224083747.417975686@linuxfoundation.org>
In-Reply-To: <20170224083746.364657938@linuxfoundation.org>
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Anoob Soman <anoob.soman@citrix.com>
[ Upstream commit 2bd624b4611ffee36422782d16e1c944d1351e98 ]
Commit 6664498280cf ("packet: call fanout_release, while UNREGISTERING a
netdev") unfortunately introduced the following issues.
1. Calling mutex_lock(&fanout_mutex) (via fanout_release()) from inside an
RCU read-side critical section. rcu_read_lock() usually disables preemption,
which prohibits calling functions that may sleep.
[ ] include/linux/rcupdate.h:560 Illegal context switch in RCU read-side critical section!
[ ]
[ ] rcu_scheduler_active = 1, debug_locks = 0
[ ] 4 locks held by ovs-vswitchd/1969:
[ ] #0: (cb_lock){++++++}, at: [<ffffffff8158a6c9>] genl_rcv+0x19/0x40
[ ] #1: (ovs_mutex){+.+.+.}, at: [<ffffffffa04878ca>] ovs_vport_cmd_del+0x4a/0x100 [openvswitch]
[ ] #2: (rtnl_mutex){+.+.+.}, at: [<ffffffff81564157>] rtnl_lock+0x17/0x20
[ ] #3: (rcu_read_lock){......}, at: [<ffffffff81614165>] packet_notifier+0x5/0x3f0
[ ]
[ ] Call Trace:
[ ] [<ffffffff813770c1>] dump_stack+0x85/0xc4
[ ] [<ffffffff810c9077>] lockdep_rcu_suspicious+0x107/0x110
[ ] [<ffffffff810a2da7>] ___might_sleep+0x57/0x210
[ ] [<ffffffff810a2fd0>] __might_sleep+0x70/0x90
[ ] [<ffffffff8162e80c>] mutex_lock_nested+0x3c/0x3a0
[ ] [<ffffffff810de93f>] ? vprintk_default+0x1f/0x30
[ ] [<ffffffff81186e88>] ? printk+0x4d/0x4f
[ ] [<ffffffff816106dd>] fanout_release+0x1d/0xe0
[ ] [<ffffffff81614459>] packet_notifier+0x2f9/0x3f0
2. Calling mutex_lock(&fanout_mutex) while holding spin_lock(&po->bind_lock),
which triggers "sleeping function called from invalid context".
[ ] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
[ ] in_atomic(): 1, irqs_disabled(): 0, pid: 1969, name: ovs-vswitchd
[ ] INFO: lockdep is turned off.
[ ] Call Trace:
[ ] [<ffffffff813770c1>] dump_stack+0x85/0xc4
[ ] [<ffffffff810a2f52>] ___might_sleep+0x202/0x210
[ ] [<ffffffff810a2fd0>] __might_sleep+0x70/0x90
[ ] [<ffffffff8162e80c>] mutex_lock_nested+0x3c/0x3a0
[ ] [<ffffffff816106dd>] fanout_release+0x1d/0xe0
[ ] [<ffffffff81614459>] packet_notifier+0x2f9/0x3f0
3. Calling dev_remove_pack(&fanout->prot_hook) from inside
spin_lock(&po->bind_lock) or an RCU read-side critical section.
dev_remove_pack() calls synchronize_net(), which might sleep.
[ ] BUG: scheduling while atomic: ovs-vswitchd/1969/0x00000002
[ ] INFO: lockdep is turned off.
[ ] Call Trace:
[ ] [<ffffffff813770c1>] dump_stack+0x85/0xc4
[ ] [<ffffffff81186274>] __schedule_bug+0x64/0x73
[ ] [<ffffffff8162b8cb>] __schedule+0x6b/0xd10
[ ] [<ffffffff8162c5db>] schedule+0x6b/0x80
[ ] [<ffffffff81630b1d>] schedule_timeout+0x38d/0x410
[ ] [<ffffffff810ea3fd>] synchronize_sched_expedited+0x53d/0x810
[ ] [<ffffffff810ea6de>] synchronize_rcu_expedited+0xe/0x10
[ ] [<ffffffff8154eab5>] synchronize_net+0x35/0x50
[ ] [<ffffffff8154eae3>] dev_remove_pack+0x13/0x20
[ ] [<ffffffff8161077e>] fanout_release+0xbe/0xe0
[ ] [<ffffffff81614459>] packet_notifier+0x2f9/0x3f0
4. fanout_release() can race with calls from a different CPU.
To fix the above problems, remove the call to fanout_release() made under
rcu_read_lock(). Instead, call __dev_remove_pack(&fanout->prot_hook), so that
netdev_run_todo() finds the &dev->ptype_specific list empty. To achieve this,
dev_{add,remove}_pack() are moved out of fanout_{add,release}() into
__fanout_{link,unlink}(), so a call to {,__}unregister_prot_hook() also
removes fanout->prot_hook.
Fixes: 6664498280cf ("packet: call fanout_release, while UNREGISTERING a netdev")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Anoob Soman <anoob.soman@citrix.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
net/packet/af_packet.c | 31 ++++++++++++++++++++++---------
1 file changed, 22 insertions(+), 9 deletions(-)
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1497,6 +1497,8 @@ static void __fanout_link(struct sock *s
 	f->arr[f->num_members] = sk;
 	smp_wmb();
 	f->num_members++;
+	if (f->num_members == 1)
+		dev_add_pack(&f->prot_hook);
 	spin_unlock(&f->lock);
 }
@@ -1513,6 +1515,8 @@ static void __fanout_unlink(struct sock
 	BUG_ON(i >= f->num_members);
 	f->arr[i] = f->arr[f->num_members - 1];
 	f->num_members--;
+	if (f->num_members == 0)
+		__dev_remove_pack(&f->prot_hook);
 	spin_unlock(&f->lock);
 }
@@ -1693,7 +1697,6 @@ static int fanout_add(struct sock *sk, u
 		match->prot_hook.func = packet_rcv_fanout;
 		match->prot_hook.af_packet_priv = match;
 		match->prot_hook.id_match = match_fanout_group;
-		dev_add_pack(&match->prot_hook);
 		list_add(&match->list, &fanout_list);
 	}
 	err = -EINVAL;
@@ -1718,7 +1721,12 @@ out:
 	return err;
 }
-static void fanout_release(struct sock *sk)
+/* If pkt_sk(sk)->fanout->sk_ref is zero, this function removes
+ * pkt_sk(sk)->fanout from fanout_list and returns pkt_sk(sk)->fanout.
+ * It is the responsibility of the caller to call fanout_release_data() and
+ * free the returned packet_fanout (after synchronize_net())
+ */
+static struct packet_fanout *fanout_release(struct sock *sk)
 {
 	struct packet_sock *po = pkt_sk(sk);
 	struct packet_fanout *f;
@@ -1728,17 +1736,17 @@ static void fanout_release(struct sock *
 	if (f) {
 		po->fanout = NULL;
-		if (atomic_dec_and_test(&f->sk_ref)) {
+		if (atomic_dec_and_test(&f->sk_ref))
 			list_del(&f->list);
-			dev_remove_pack(&f->prot_hook);
-			fanout_release_data(f);
-			kfree(f);
-		}
+		else
+			f = NULL;
 		if (po->rollover)
 			kfree_rcu(po->rollover, rcu);
 	}
 	mutex_unlock(&fanout_mutex);
+
+	return f;
 }
 static bool packet_extra_vlan_len_allowed(const struct net_device *dev,
@@ -2970,6 +2978,7 @@ static int packet_release(struct socket
 {
 	struct sock *sk = sock->sk;
 	struct packet_sock *po;
+	struct packet_fanout *f;
 	struct net *net;
 	union tpacket_req_u req_u;
@@ -3009,9 +3018,14 @@ static int packet_release(struct socket
 		packet_set_ring(sk, &req_u, 1, 1);
 	}
-	fanout_release(sk);
+	f = fanout_release(sk);
 	synchronize_net();
+
+	if (f) {
+		fanout_release_data(f);
+		kfree(f);
+	}
 	/*
 	 * Now the socket is dead. No more input will appear.
 	 */
@@ -3963,7 +3977,6 @@ static int packet_notifier(struct notifi
 					}
 				if (msg == NETDEV_UNREGISTER) {
 					packet_cached_dev_reset(po);
-					fanout_release(sk);
 					po->ifindex = -1;
 					if (po->prot_hook.dev)
 						dev_put(po->prot_hook.dev);