Patch "net: sched: fix call_rcu() race on classifier module unloads" has been added to the 4.0-stable tree

All of lore.kernel.org
 help / color / mirror / Atom feed

* Patch "net: sched: fix call_rcu() race on classifier module unloads" has been added to the 4.0-stable tree
@ 2015-06-13 17:02 gregkh
  0 siblings, 0 replies; only message in thread
From: gregkh @ 2015-06-13 17:02 UTC (permalink / raw)
  To: daniel, ast, davem, edumazet, gregkh, jhs, john.r.fastabend,
	subramanian.vijay, tgraf
  Cc: stable, stable-commits

This is a note to let you know that I've just added the patch titled

    net: sched: fix call_rcu() race on classifier module unloads

to the 4.0-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     net-sched-fix-call_rcu-race-on-classifier-module-unloads.patch
and it can be found in the queue-4.0 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.

>From foo@baz Sat Jun 13 09:48:35 PDT 2015
From: Daniel Borkmann <daniel@iogearbox.net>
Date: Wed, 20 May 2015 17:13:33 +0200
Subject: net: sched: fix call_rcu() race on classifier module unloads

From: Daniel Borkmann <daniel@iogearbox.net>

[ Upstream commit c78e1746d3ad7d548bdf3fe491898cc453911a49 ]

Vijay reported that a loop as simple as ...

  while true; do
    tc qdisc add dev foo root handle 1: prio
    tc filter add dev foo parent 1: u32 match u32 0 0  flowid 1
    tc qdisc del dev foo root
    rmmod cls_u32
  done

... will panic the kernel. Moreover, he bisected the change
apparently introducing it to 78fd1d0ab072 ("netlink: Re-add
locking to netlink_lookup() and seq walker").

The removal of synchronize_net() from the netlink socket
triggering the qdisc to be removed, seems to have uncovered
an RCU resp. module reference count race from the tc API.
Given that RCU conversion was done after e341694e3eb5 ("netlink:
Convert netlink_lookup() to use RCU protected hash table")
which added the synchronize_net() originally, occasion of
hitting the bug was less likely (not impossible though):

When qdiscs that i) support attaching classifiers and,
ii) have at least one of them attached, get deleted, they
invoke tcf_destroy_chain(), and thus call into ->destroy()
handler from a classifier module.

After RCU conversion, all classifier that have an internal
prio list, unlink them and initiate freeing via call_rcu()
deferral.

Meanhile, tcf_destroy() releases already reference to the
tp->ops->owner module before the queued RCU callback handler
has been invoked.

Subsequent rmmod on the classifier module is then not prevented
since all module references are already dropped.

By the time, the kernel invokes the RCU callback handler from
the module, that function address is then invalid.

One way to fix it would be to add an rcu_barrier() to
unregister_tcf_proto_ops() to wait for all pending call_rcu()s
to complete.

synchronize_rcu() is not appropriate as under heavy RCU
callback load, registered call_rcu()s could be deferred
longer than a grace period. In case we don't have any pending
call_rcu()s, the barrier is allowed to return immediately.

Since we came here via unregister_tcf_proto_ops(), there
are no users of a given classifier anymore. Further nested
call_rcu()s pointing into the module space are not being
done anywhere.

Only cls_bpf_delete_prog() may schedule a work item, to
unlock pages eventually, but that is not in the range/context
of cls_bpf anymore.

Fixes: 25d8c0d55f24 ("net: rcu-ify tcf_proto")
Fixes: 9888faefe132 ("net: sched: cls_basic use RCU")
Reported-by: Vijay Subramanian <subramanian.vijay@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: John Fastabend <john.r.fastabend@intel.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Tested-by: Vijay Subramanian <subramanian.vijay@gmail.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/sched/cls_api.c |    5 +++++
 1 file changed, 5 insertions(+)

--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -81,6 +81,11 @@ int unregister_tcf_proto_ops(struct tcf_
 	struct tcf_proto_ops *t;
 	int rc = -ENOENT;

+	/* Wait for outstanding call_rcu()s, if any, from a
+	 * tcf_proto_ops's destroy() handler.
+	 */
+	rcu_barrier();
+
 	write_lock(&cls_mod_lock);
 	list_for_each_entry(t, &tcf_proto_base, head) {
 		if (t == ops) {

Patches currently in stable-queue which might be from daniel@iogearbox.net are

queue-4.0/x86-bpf_jit-fix-compilation-of-large-bpf-programs.patch
queue-4.0/sctp-fix-mangled-ipv4-addresses-on-a-ipv6-listening-socket.patch
queue-4.0/net-sched-fix-call_rcu-race-on-classifier-module-unloads.patch
queue-4.0/tcp-fix-child-sockets-to-use-system-default-congestion-control-if-not-set.patch

^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2015-06-13 17:02 UTC | newest]

Thread overview: (only message) (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-06-13 17:02 Patch "net: sched: fix call_rcu() race on classifier module unloads" has been added to the 4.0-stable tree gregkh

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.